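+ (NSString *)replaceUnicode:(NSString *)unicodeStr {
    // Turn "\uXXXX" into "\UXXXX", the escape form the old-style property list parser understands.
    NSString *tempStr1 = [unicodeStr stringByReplacingOccurrencesOfString:@"\\u" withString:@"\\U"];
    // Escape embedded double quotes, then wrap everything in quotes so it parses as a plist string.
    NSString *tempStr2 = [tempStr1 stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
    NSString *tempStr3 = [[@"\"" stringByAppendingString:tempStr2] stringByAppendingString:@"\""];
    NSData *tempData = [tempStr3 dataUsingEncoding:NSUTF8StringEncoding];
    // This method is deprecated; propertyListWithData:options:format:error: is its replacement.
    NSString *returnStr = [NSPropertyListSerialization propertyListFromData:tempData
                                                           mutabilityOption:NSPropertyListImmutable
                                                                     format:NULL
                                                           errorDescription:NULL];
    // Convert literal "\r\n" sequences into real newlines.
    return [returnStr stringByReplacingOccurrencesOfString:@"\\r\\n" withString:@"\n"];
}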

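Converting between Chinese characters and percent-encoded UTF-8

// Decode: strA is @"中国". Deprecated; stringByRemovingPercentEncoding is the modern equivalent.
NSString *strA = [@"%E4%B8%AD%E5%9B%BD" stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
// Encode: strB is @"%E4%B8%AD%E5%9B%BD". Deprecated; see stringByAddingPercentEncodingWithAllowedCharacters:.
NSString *strB = [@"中国" stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];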
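
To make the byte-level behaviour of the two calls above concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The function names percent_encode and percent_decode are hypothetical, and the set of characters left unescaped is the RFC 3986 "unreserved" set, which is an assumption and not necessarily the same set the Foundation methods use.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* RFC 3986 "unreserved" characters are copied through unchanged. */
static int is_unreserved(unsigned char b)
{
    return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
           (b >= '0' && b <= '9') ||
           b == '-' || b == '.' || b == '_' || b == '~';
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Percent-encode every byte of a UTF-8 string that is not unreserved.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *percent_encode(const char *utf8)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(utf8);
    char *out = malloc(len * 3 + 1);   /* worst case: every byte becomes %XX */
    if (out == NULL) return NULL;

    char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)utf8[i];
        if (is_unreserved(b)) {
            *p++ = (char)b;
        } else {
            *p++ = '%';
            *p++ = hex[b >> 4];
            *p++ = hex[b & 0x0F];
        }
    }
    *p = '\0';
    return out;
}

/* Turn %XX sequences back into raw bytes; malformed escapes are copied as-is.
   Returns a malloc'd string the caller must free, or NULL on failure. */
char *percent_decode(const char *encoded)
{
    size_t len = strlen(encoded);
    char *out = malloc(len + 1);       /* decoding never grows the string */
    if (out == NULL) return NULL;

    char *p = out;
    for (size_t i = 0; i < len; i++) {
        if (encoded[i] == '%' && i + 2 < len) {
            int hi = hex_value(encoded[i + 1]);
            int lo = hex_value(encoded[i + 2]);
            if (hi >= 0 && lo >= 0) {
                *p++ = (char)(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        *p++ = encoded[i];
    }
    *p = '\0';
    return out;
}
```

Passing the six UTF-8 bytes of 中国 (E4 B8 AD E5 9B BD) to percent_encode gives %E4%B8%AD%E5%9B%BD, and percent_decode turns that back into the original bytes.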

Converting an NSString to percent-encoded UTF-8

NSString *strings = @"abc";
NSLog(@"strings : %@", strings);

// Declared in CFURL.h (deprecated in recent SDKs):
// CFStringRef CFURLCreateStringByAddingPercentEscapes(CFAllocatorRef allocator,
//     CFStringRef originalString, CFStringRef charactersToLeaveUnescaped,
//     CFStringRef legalURLCharactersToBeEscaped, CFStringEncoding encoding);
// The function follows the Create rule, so CFBridgingRelease hands ownership to ARC
// (a plain __bridge cast would leak the returned string).
NSString *encodedValue = CFBridgingRelease(CFURLCreateStringByAddingPercentEscapes(
    NULL, (__bridge CFStringRef)strings, NULL,
    CFSTR("!*'();:@&=+$,/?%#[]"), kCFStringEncodingUTF8));

Converting ISO 8859-1 to Unicode

+ (NSString *)changeISO88591StringToUnicodeString:(NSString *)iso88591String
{
    NSMutableString *srcString = [NSMutableString stringWithString:iso88591String];
    // Undo the HTML escaping of "&", then drop each "&#x" prefix so that every
    // ";"-separated component is a bare hex code point.
    [srcString replaceOccurrencesOfString:@"&amp;" withString:@"&" options:NSLiteralSearch range:NSMakeRange(0, [srcString length])];
    [srcString replaceOccurrencesOfString:@"&#x" withString:@"" options:NSLiteralSearch range:NSMakeRange(0, [srcString length])];

    NSMutableString *desString = [NSMutableString string];
    NSArray *arr = [srcString componentsSeparatedByString:@";"];
    // The component after the final ";" is empty, so stop one short.
    for (NSUInteger i = 0; i + 1 < [arr count]; i++) {
        NSString *v = [arr objectAtIndex:i];
        // changeHexStringToDecimal: is a helper not shown in this post; it is assumed to parse a hex string into an int.
        unichar ch = (unichar)[StringUtil changeHexStringToDecimal:v];
        // Build the string from the UTF-16 code unit directly. Copying the two bytes
        // into a C string truncates whenever either byte is zero.
        [desString appendString:[NSString stringWithCharacters:&ch length:1]];
    }
    return desString;
}

Q: Is there a standard method to package a Unicode character so it fits an 8-Bit ASCII stream?

A: There are three or four options for making Unicode fit into an 8-bit format.

a) Use UTF-8. This preserves ASCII, but not Latin-1, because the characters >127 are different from Latin-1. UTF-8 uses the bytes in the ASCII range only for ASCII characters. Therefore, it works well in any environment where ASCII characters have a significance as syntax characters, e.g. file name syntaxes, markup languages, etc., but where all other characters may use arbitrary bytes.
Example: “Latin Small Letter s with Acute” (015B) would be encoded as two bytes: C5 9B.

b) Use Java or C style escapes, of the form \uXXXXX or \xXXXXX. This format is not standard for text files, but well defined in the framework of the languages in question, primarily for source files.
Example: The Polish word “wyjście” with character “Latin Small Letter s with Acute” (015B) in the middle (ś is one character) would look like: “wyj\u015Bcie”.

c) Use the &#xXXXX; or &#DDDDD; numeric character escapes as in HTML or XML. Again, these are not standard for plain text files, but well defined within the framework of these markup languages.
Example: “wyjście” would look like “wyj&#x015B;cie”

d) Use SCSU. This format compresses Unicode into 8-bit format, preserving most of ASCII, but using some of the control codes as commands for the decoder. However, while ASCII text will look like ASCII text after being encoded in SCSU, other characters may occasionally be encoded with the same byte values, making SCSU unsuitable for 8-bit channels that blindly interpret any of the bytes as ASCII characters.
Example: “<SC2> wyjÛcie” where <SC2> indicates the byte 0x12 and “Û” corresponds to byte 0xDB. [AF] & [KW]
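
Options (a) and (b) can be checked with a few lines of plain C. This is only a sketch: utf8_encode and format_u_escape are hypothetical names, and the escape formatter is limited to code points up to U+FFFF, which is an assumption made for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Encode one Unicode scalar value as UTF-8 (option a).
   Writes up to 4 bytes into out and returns the byte count,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Format a code point up to U+FFFF as a \uXXXX escape (option b).
   Returns the number of characters written, or -1 if cp is too large
   or buf is too small. */
int format_u_escape(uint32_t cp, char *buf, size_t size)
{
    if (cp > 0xFFFF) return -1;
    int n = snprintf(buf, size, "\\u%04X", (unsigned)cp);
    return (n > 0 && (size_t)n < size) ? n : -1;
}
```

For U+015B, utf8_encode writes the two bytes C5 9B and format_u_escape writes \u015B, matching the FAQ examples.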

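As option (c) describes, this is a "non-standard" but widely adopted practice; you could even call it a knockoff encoding :-)

So the encoding process is:

string -> Unicode code points -> &#xXXXX; or &#DDDDD;

Decoding simply runs the same steps in reverse.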
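
The encoding direction can be sketched in plain C as follows. The names utf8_decode and ncr_encode are hypothetical, the input is assumed to be UTF-8, and the decoder is deliberately simplified (it does not reject overlong forms or surrogates). ASCII characters, including "&", are passed through unchanged, which a real HTML encoder would need to handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one UTF-8 sequence starting at s into *cp.
   Returns the number of bytes consumed, or 0 on malformed input. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    size_t n;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; }
    else return 0;

    for (size_t i = 1; i < n; i++) {
        /* A continuation byte must look like 10xxxxxx; this check also
           stops at the terminating NUL of a truncated sequence. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return n;
}

/* string -> code points -> &#xXXXX;
   ASCII is copied through; every other character becomes a hex reference.
   Returns 0 on success, -1 if the input is malformed or out is too small. */
int ncr_encode(const char *utf8, char *out, size_t size)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t used = 0;

    if (size == 0) return -1;
    out[0] = '\0';

    while (*s != 0) {
        uint32_t cp;
        size_t n = utf8_decode(s, &cp);
        if (n == 0) return -1;

        int w;
        if (cp < 0x80)
            w = snprintf(out + used, size - used, "%c", (int)cp);
        else
            w = snprintf(out + used, size - used, "&#x%04X;", (unsigned)cp);
        if (w < 0 || (size_t)w >= size - used) return -1;   /* output truncated */

        used += (size_t)w;
        s += n;
    }
    return 0;
}
```

Given the UTF-8 bytes of "wyjście", ncr_encode produces wyj&#x015B;cie. Decoding would scan for "&#x", parse the hex digits up to ";", and re-encode each code point as UTF-8.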

http://unicode.org/faq/utf_bom.html#General
