A Study of How C# Stores Basic Types

Basic units

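Binary: nearly all current computer systems are binary systems.
The unit of binary is the bit, and each bit can represent two values: 0 or 1.
A byte has 8 bits, so it can represent 2 to the 8th power, or 256, distinct values, in the range [0, 255].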
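The arithmetic above can be checked with a small sketch. The class name `BitRangeDemo` and the helper `ValueCount` are made up for illustration; only `byte.MinValue` and `byte.MaxValue` come from the framework.

```csharp
using System;

public static class BitRangeDemo
{
    // Number of distinct values that fit in the given number of bits (valid for bits < 63).
    public static long ValueCount(int bits) => 1L << bits;

    public static void Main()
    {
        Console.WriteLine(ValueCount(1));   // 2
        Console.WriteLine(ValueCount(8));   // 256
        Console.WriteLine(byte.MinValue);   // 0
        Console.WriteLine(byte.MaxValue);   // 255
    }
}
```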

Numeric types

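Below are the C# type sizes I have compiled; corrections are welcome.
int has 32 bits and can be converted to a byte[4]. It can represent 4,294,967,296 (2 to the 32nd power) distinct values, in the range [-2,147,483,648, 2,147,483,647].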
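As a rough illustration of the int to byte[4] conversion, the sketch below uses `BitConverter`. The class name `IntBytesDemo` and the helper `RoundTrip` are hypothetical. The order of the four bytes depends on the platform, so no particular order is assumed here.

```csharp
using System;

public static class IntBytesDemo
{
    // Convert an int to its 4 bytes and back again.
    public static int RoundTrip(int value) =>
        BitConverter.ToInt32(BitConverter.GetBytes(value), 0);

    public static void Main()
    {
        Console.WriteLine(sizeof(int));     // 4 (bytes), i.e. 32 bits
        Console.WriteLine(int.MinValue);    // -2147483648
        Console.WriteLine(int.MaxValue);    // 2147483647

        byte[] bytes = BitConverter.GetBytes(258);
        Console.WriteLine(bytes.Length);    // 4
        // The byte order printed here depends on the platform's endianness.
        Console.WriteLine(string.Join(",", bytes));
        Console.WriteLine(RoundTrip(258));  // 258
    }
}
```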

The char type

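Any discussion of the char type has to start with encoding: different encodings map to different sets of characters. The encoding meant here is character (text) encoding.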

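A character encoding is a mapping between characters and numbers: each number corresponds to one character, and each character corresponds to one number.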
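In C# this mapping is visible directly, because a char converts to and from its numeric code. A minimal sketch follows; the names `CharCodeDemo`, `CodeOf`, and `CharOf` are invented for this example.

```csharp
using System;

public static class CharCodeDemo
{
    // A char converts implicitly to its numeric code.
    public static int CodeOf(char c) => c;

    // Going the other way needs an explicit cast.
    public static char CharOf(int code) => (char)code;

    public static void Main()
    {
        Console.WriteLine(CodeOf('a'));     // 97
        Console.WriteLine(CodeOf('中'));    // 20013 (0x4E2D)
        // Whether this glyph displays correctly depends on the console's output encoding.
        Console.WriteLine(CharOf(0x4E2D));
    }
}
```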

This topic has a long history and is not easy to explain fully. If you are interested, see this article:
https://www.cnblogs.com/criedshy/archive/2012/08/07/2625358.html

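Just a few points:
1. Unicode is a character encoding scheme, defined by an international body, that can hold all the world's scripts and symbols. It does not include an implementation, that is, it does not say how the numbers are stored as bytes.
2. UTF-32, UTF-7 (obsolete), UTF-8, and UTF-16 are implementations of Unicode.
3. In C#, "Unicode" (as in Encoding.Unicode) means UTF-16 with little-endian byte order.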
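The points above can be observed from code. The sketch below (class name `UnicodeNameDemo` and helper `ByteCounts` are made up) shows that `Encoding.Unicode` reports itself as UTF-16, and that the different implementations use different numbers of bytes for the same character.

```csharp
using System;
using System.Text;

public static class UnicodeNameDemo
{
    // Byte counts of the same text under UTF-8, UTF-16 and UTF-32, in that order.
    public static int[] ByteCounts(string text) => new[]
    {
        Encoding.UTF8.GetByteCount(text),
        Encoding.Unicode.GetByteCount(text),
        Encoding.UTF32.GetByteCount(text),
    };

    public static void Main()
    {
        Console.WriteLine(Encoding.Unicode.WebName);            // utf-16
        Console.WriteLine(sizeof(char));                        // 2 (one UTF-16 code unit)
        Console.WriteLine(string.Join(",", ByteCounts("a")));   // 1,2,4
        Console.WriteLine(string.Join(",", ByteCounts("中")));  // 3,2,4
    }
}
```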

Below is a set of tests:

        // Requires: using System; using System.Text;
        // On .NET Core / .NET 5+, "gb2312" is only available after calling
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
        public void StartTest()
        {
            Console.WriteLine("Exploring char encodings");
            TestOneEncode("ascii");
            TestOneEncode("gb2312");
            TestOneEncode("unicode");
            TestOneEncode("utf-8");
            TestOneEncode("utf-16");
            TestOneEncode("utf-32");
        }

        private static void TestOneEncode(string encodeName)
        {
            Encoding encoding = Encoding.GetEncoding(encodeName);
            // Encode each sample character, print its bytes, then decode it again.
            foreach (char c in new[] { 'a', '、', '中' })
            {
                byte[] bytes = encoding.GetBytes(new[] { c });
                Console.WriteLine($"'{c}' encoded with {encodeName}, byte values: {string.Join(",", bytes)}");
                Console.WriteLine($"'{c}' encoded with {encodeName}, then decoded with {encodeName}, gives: {encoding.GetChars(bytes)[0]}");
            }
            Console.WriteLine();
        }

The key part of the test code; nothing sophisticated.

Below is the output:

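Exploring char encodings
'a' encoded with ascii, byte values: 97
'a' encoded with ascii, then decoded with ascii, gives: a
'、' encoded with ascii, byte values: 63
'、' encoded with ascii, then decoded with ascii, gives: ?
'中' encoded with ascii, byte values: 63
'中' encoded with ascii, then decoded with ascii, gives: ?

'a' encoded with gb2312, byte values: 97
'a' encoded with gb2312, then decoded with gb2312, gives: a
'、' encoded with gb2312, byte values: 161,162
'、' encoded with gb2312, then decoded with gb2312, gives: 、
'中' encoded with gb2312, byte values: 214,208
'中' encoded with gb2312, then decoded with gb2312, gives: 中

'a' encoded with unicode, byte values: 97,0
'a' encoded with unicode, then decoded with unicode, gives: a
'、' encoded with unicode, byte values: 1,48
'、' encoded with unicode, then decoded with unicode, gives: 、
'中' encoded with unicode, byte values: 45,78
'中' encoded with unicode, then decoded with unicode, gives: 中

'a' encoded with utf-8, byte values: 97
'a' encoded with utf-8, then decoded with utf-8, gives: a
'、' encoded with utf-8, byte values: 227,128,129
'、' encoded with utf-8, then decoded with utf-8, gives: 、
'中' encoded with utf-8, byte values: 228,184,173
'中' encoded with utf-8, then decoded with utf-8, gives: 中

'a' encoded with utf-16, byte values: 97,0
'a' encoded with utf-16, then decoded with utf-16, gives: a
'、' encoded with utf-16, byte values: 1,48
'、' encoded with utf-16, then decoded with utf-16, gives: 、
'中' encoded with utf-16, byte values: 45,78
'中' encoded with utf-16, then decoded with utf-16, gives: 中

'a' encoded with utf-32, byte values: 97,0,0,0
'a' encoded with utf-32, then decoded with utf-32, gives: a
'、' encoded with utf-32, byte values: 1,48,0,0
'、' encoded with utf-32, then decoded with utf-32, gives: 、
'中' encoded with utf-32, byte values: 45,78,0,0
'中' encoded with utf-32, then decoded with utf-32, gives: 中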

Other notes

1. If showing the bytes as decimal numbers above is not intuitive enough, the code below converts them to binary instead.

        // Requires: using System; using System.Linq;
        private static string ConvertToBinaryString(byte[] bytes)
        {
            // Render each byte as 8 binary digits, padded with leading zeros.
            return string.Join(",", bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));
        }

Converting a byte[] to a binary string

2. Numeric type design in C#

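There are two byte orders for storing numbers: big-endian and little-endian. C# stores numbers in the byte order of the platform it runs on, which is little-endian on x86/x64 (BitConverter.IsLittleEndian reports this). When exchanging binary data with other languages, a conversion may be needed.

For details, see Baidu Baike: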

https://baike.baidu.com/item/%E5%A4%A7%E5%B0%8F%E7%AB%AF%E6%A8%A1%E5%BC%8F/6750542?fromtitle=%E5%A4%A7%E7%AB%AF%E5%B0%8F%E7%AB%AF&fromid=15925891&fr=aladdin
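One way to make the byte order explicit when exchanging data is sketched below. It assumes .NET Core 2.1 or later (or the System.Memory package), where `System.Buffers.Binary.BinaryPrimitives` is available. The class name `EndianDemo` and its helper methods are hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public static class EndianDemo
{
    // Write an int as 4 bytes, most significant byte first.
    public static byte[] ToBigEndian(int value)
    {
        byte[] buffer = new byte[4];
        BinaryPrimitives.WriteInt32BigEndian(buffer, value);
        return buffer;
    }

    // Write an int as 4 bytes, least significant byte first.
    public static byte[] ToLittleEndian(int value)
    {
        byte[] buffer = new byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(buffer, value);
        return buffer;
    }

    public static void Main()
    {
        // True on little-endian platforms such as x86/x64.
        Console.WriteLine(BitConverter.IsLittleEndian);
        Console.WriteLine(string.Join(",", ToLittleEndian(1)));  // 1,0,0,0
        Console.WriteLine(string.Join(",", ToBigEndian(1)));     // 0,0,0,1
        Console.WriteLine(BinaryPrimitives.ReadInt32BigEndian(ToBigEndian(1)));  // 1
    }
}
```

Because the byte order is named in the call, the result does not depend on the machine the code runs on, unlike `BitConverter.GetBytes`.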
