Character Mapping Table: this table is adapted from win1251BulgarianCharToOrderMap, so only numbers < 64 are guaranteed to be valid.
BEGIN LICENSE BLOCK ########################
The Original Code is mozilla.org code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges - port to Ruby Mark Pilgrim - port to Python
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
Big5 frequency table by Taiwan's Mandarin Promotion Council <www.edu.tw:81/mandr/>
128 --> 0.42261
256 --> 0.57851
512 --> 0.74851
1024 --> 0.89384
2048 --> 0.97583
Ideal Distribution Ratio = 0.74851 / (1 - 0.74851) = 2.98
Random Distribution Ratio = 512 / (5401 - 512) = 0.105
Typical Distribution Ratio is about 25% of the ideal one, still much higher than the RDR.
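The arithmetic behind these ratios can be checked with a short Ruby sketch. The method names below are hypothetical and are not part of rchardet; the inputs are the Big5 figures quoted above (coverage of the 512 most frequent characters, and a table of 5401 characters).

```ruby
# Hypothetical helpers, not rchardet API: they only restate the formulas
# given in the comment above.

# Ratio of text covered by the frequent characters to text that is not.
def ideal_distribution_ratio(coverage)
  coverage / (1.0 - coverage)
end

# The same ratio if every character in the table were equally likely.
def random_distribution_ratio(frequent_count, total_count)
  frequent_count.to_f / (total_count - frequent_count)
end

ideal   = ideal_distribution_ratio(0.74851)    # Big5, first 512 characters
random  = random_distribution_ratio(512, 5401) # Big5 table size
typical = 0.25 * ideal                         # "about 25% of the ideal one"

puts format("ideal=%.2f random=%.3f typical=%.2f", ideal, random, typical)
```

Even at a quarter of the ideal value, the typical ratio stays several times larger than the random one, which is what lets the frequency analysis separate real text from noise.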
BEGIN LICENSE BLOCK ########################
The Original Code is Mozilla Universal charset detector code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 2001 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges - port to Ruby Mark Pilgrim - port to Python Shy Shalom - original C code
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
BEGIN LICENSE BLOCK ########################
The Original Code is Mozilla Universal charset detector code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 2001 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges - port to Ruby Mark Pilgrim - port to Python Shy Shalom - original C code Proofpoint, Inc.
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
windows-1255 / ISO-8859-8 code points of interest
512 --> 0.79 -- 0.79
1024 --> 0.92 -- 0.13
2048 --> 0.98 -- 0.06
6768 --> 1.00 -- 0.02
Ideal Distribution Ratio = 0.79135 / (1 - 0.79135) = 3.79
Random Distribution Ratio = 512 / (3755 - 512) = 0.157
Typical Distribution Ratio is about 25% of the ideal one, still much higher than the RDR.
BEGIN LICENSE BLOCK ########################
The Original Code is Mozilla Communicator client code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges - port to Ruby Mark Pilgrim - port to Python
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
BEGIN LICENSE BLOCK ########################
The Original Code is Mozilla Communicator client code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges Mark Pilgrim - port to Python
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
128 --> 0.77094
256 --> 0.85710
512 --> 0.92635
1024 --> 0.97130
2048 --> 0.99431
Ideal Distribution Ratio = 0.92635 / (1 - 0.92635) = 12.58
Random Distribution Ratio = 512 / (2965 + 62 + 83 + 86 - 512) = 0.191
Typical Distribution Ratio, 25% of IDR
128 --> 0.79
256 --> 0.92
512 --> 0.986
1024 --> 0.99944
2048 --> 0.99999
Ideal Distribution Ratio = 0.98653 / (1 - 0.98653) = 73.24
Random Distribution Ratio = 512 / (2350 - 512) = 0.279
Typical Distribution Ratio
128 --> 0.42261
256 --> 0.57851
512 --> 0.74851
1024 --> 0.89384
2048 --> 0.97583
Ideal Distribution Ratio = 0.74851 / (1 - 0.74851) = 2.98
Random Distribution Ratio = 512 / (5401 - 512) = 0.105
Typical Distribution Ratio is about 25% of the ideal one, still much higher than the RDR.
BEGIN LICENSE BLOCK ########################
The Original Code is Mozilla Universal charset detector code.
The Initial Developer of the Original Code is Simon Montagu. Portions created by the Initial Developer are Copyright (C) 2005 the Initial Developer. All Rights Reserved.
Contributor(s):
Jeff Hodges - port to Ruby Mark Pilgrim - port to Python Shy Shalom - original C code Shoshannah Forbes - original C code (?)
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
BEGIN LICENSE BLOCK ########################
The Original Code is mozilla.org code.
The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. All Rights Reserved.
Contributor(s):
Mark Pilgrim - port to Python
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
END LICENSE BLOCK #########################
Char to FreqOrder table
BIG5
Model Table:
total sequences: 100%
first 512 sequences: 96.9392%
first 1024 sequences: 3.0618%
rest sequences: 0.2992%
negative sequences: 0.0020%
EUC-JP
Char to FreqOrder table
EUC-KR
Char to FreqOrder table
EUC-TW
To be accurate, the length of class 6 can be either 2 or 4. It is not necessary to discriminate between the two, since the length is used for frequency analysis only and each code range is validated there as well. So it is safe to set it to 2 here.
GB2312
Model Table:
total sequences: 100%
first 512 sequences: 98.2851%
first 1024 sequences: 1.7001%
rest sequences: 0.0359%
negative sequences: 0.0148%
Model Table:
total sequences: 100%
first 512 sequences: 98.4004%
first 1024 sequences: 1.5981%
rest sequences: 0.087%
negative sequences: 0.0015%
Model Table:
total sequences: 100%
first 512 sequences: 94.7368%
first 1024 sequences: 5.2623%
rest sequences: 0.8894%
negative sequences: 0.0009%
Char to FreqOrder table
KOI8-R language model
Character Mapping Table:
0 : illegal
1 : very unlikely
2 : normal
3 : very likely
Character Mapping Table:
Character Mapping Table:
Minimum Visual vs Logical final letter score difference. If the difference is below this, don't rely solely on the final letter score distance.
Minimum Visual vs Logical model score difference. If the difference is below this, don't rely at all on the model score distance.
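One way to read the two thresholds is as a three-step tie-break between logical Hebrew (windows-1255) and visual Hebrew (ISO-8859-8). The Ruby sketch below is illustrative only: the method name, its parameters, and the threshold values 5 and 0.01 are assumptions made for the example, not values taken from this page.

```ruby
# Assumed threshold values, chosen for illustration only.
MIN_FINAL_CHAR_DISTANCE = 5
MIN_MODEL_DISTANCE = 0.01

# Hypothetical helper, not rchardet API. The first pair of arguments are
# final-letter scores, the second pair are language-model confidences.
def pick_hebrew_charset(final_logical, final_visual, model_logical, model_visual)
  # 1. A large final-letter score gap decides on its own.
  final_sub = final_logical - final_visual
  return "windows-1255" if final_sub >= MIN_FINAL_CHAR_DISTANCE
  return "ISO-8859-8" if final_sub <= -MIN_FINAL_CHAR_DISTANCE

  # 2. Otherwise fall back to the model scores, if they differ enough.
  model_sub = model_logical - model_visual
  return "windows-1255" if model_sub > MIN_MODEL_DISTANCE
  return "ISO-8859-8" if model_sub < -MIN_MODEL_DISTANCE

  # 3. Still undecided: use the sign of the small final-letter gap,
  #    defaulting to logical Hebrew on a tie.
  final_sub < 0 ? "ISO-8859-8" : "windows-1255"
end

puts pick_hebrew_charset(10, 2, 0.5, 0.5)
puts pick_hebrew_charset(3, 2, 0.40, 0.60)
```

In the second call the final-letter gap is too small to trust, so the model scores decide and the visual encoding is chosen.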
Model Table:
total sequences: 100%
first 512 sequences: 97.6601%
first 1024 sequences: 2.3389%
rest sequences: 0.1237%
negative sequences: 0.0009%
Shift_JIS
Character Mapping Table:
Model Table:
total sequences: 100%
first 512 sequences: 92.6386%
first 1024 sequences: 7.3177%
rest sequences: 1.0230%
negative sequences: 0.0436%
UCS2-BE
UCS2-LE
UTF-8
Windows-1255 language model
Character Mapping Table:
# File lib/rchardet/charsetgroupprober.rb, line 62
def feed(aBuf)
  for prober in @_mProbers
    next unless prober
    next unless prober.active
    st = prober.feed(aBuf)
    next unless st
    if st == EFoundIt
      @_mBestGuessProber = prober
      return get_state()
    elsif st == ENotMe
      prober.active = false
      @_mActiveNum -= 1
      if @_mActiveNum <= 0
        @_mState = ENotMe
        return get_state()
      end
    end
  end
  return get_state()
end
# File lib/rchardet/charsetgroupprober.rb, line 53
def get_charset_name
  if not @_mBestGuessProber
    get_confidence()
    return nil unless @_mBestGuessProber
    # self._mBestGuessProber = self._mProbers[0]
  end
  return @_mBestGuessProber.get_charset_name()
end
# File lib/rchardet/charsetgroupprober.rb, line 83
def get_confidence()
  st = get_state()
  if st == EFoundIt
    return 0.99
  elsif st == ENotMe
    return 0.01
  end
  bestConf = 0.0
  @_mBestGuessProber = nil
  for prober in @_mProbers
    next unless prober
    unless prober.active
      $stderr << "#{prober.get_charset_name()} not active\n" if $debug
      next
    end
    cf = prober.get_confidence()
    $stderr << "#{prober.get_charset_name} confidence = #{cf}\n" if $debug
    if bestConf < cf
      bestConf = cf
      @_mBestGuessProber = prober
    end
  end
  return 0.0 unless @_mBestGuessProber
  return bestConf
  # else:
  #   self._mBestGuessProber = self._mProbers[0]
  #   return self._mBestGuessProber.get_confidence()
end
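The selection loop in get_confidence can be tried in isolation with stand-in probers. This is a minimal sketch, not rchardet code: StubProber and best_guess are hypothetical names, and the confidences are made-up values chosen for the example.

```ruby
# Hypothetical stand-in for a charset prober; not part of rchardet.
StubProber = Struct.new(:charset_name, :confidence, :active)

# Return the active prober with the highest confidence, or nil when no
# prober is active or every confidence is zero.
def best_guess(probers)
  best = nil
  best_conf = 0.0
  probers.each do |prober|
    next unless prober && prober.active
    if best_conf < prober.confidence
      best_conf = prober.confidence
      best = prober
    end
  end
  best
end

probers = [
  StubProber.new("Big5", 0.42, true),
  StubProber.new("EUC-KR", 0.87, false), # deactivated earlier, so skipped
  StubProber.new("Shift_JIS", 0.61, true)
]

winner = best_guess(probers)
puts winner.charset_name
```

The inactive prober is skipped even though it has the highest confidence, mirroring how feed deactivates a prober once it reports ENotMe.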