What happens when you cast a large integer to a float


markus_p

This is a general question about what exactly happens when I cast a very large/small SIGNED integer to a float using gcc 4.4.

I see some strange behavior when doing the cast. Here are some examples:

MUST BE is obtained with:

float f = (float)x;
unsigned int r;
memcpy(&r, &f, sizeof(unsigned int));
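
For anyone who wants to reproduce the MUST BE values, the same bit extraction can be wrapped in a small helper. This is only a sketch: the name float_bits is made up for illustration, and it assumes float is a 32-bit IEEE 754 single.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of btest): returns the raw bit
 * pattern of a float. Assumes float is 32 bits wide. */
uint32_t float_bits(float f) {
  uint32_t r;
  assert(sizeof r == sizeof f);  /* guard the 32-bit assumption */
  memcpy(&r, &f, sizeof r);      /* copy the bytes, no conversion */
  return r;
}
```

With the default rounding mode, float_bits((float)-2139095039) gives 0xCEFF0000, where -2139095039 is the input 0x80800001 read as a signed int.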

./btest -f float_i2f -1 0x80800001
input:          10000000100000000000000000000001
absolute value: 01111111011111111111111111111111

exponent:       10011101
mantissa:       00000000011111101111111111111111  (right shifted absolute value)

EXPECT:         11001110111111101111111111111111  (sign|exponent|mantissa)
MUST BE:        11001110111111110000000000000000  (sign ok, exponent ok,
                                                     mantissa???)

./btest -f float_i2f -1 0x3f7fffe0

EXPECT:    01001110011111011111111111111111
MUST BE:   01001110011111100000000000000000

./btest -f float_i2f -1 0x80004999                                                                  


EXPECT:    11001110111111111111111101101100
MUST BE:   11001110111111111111111101101101    (<- 1 added at the end)

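So what bothers me is that in these examples the mantissa differs from what I get by simply shifting my integer value to the right. The trailing zeros, for instance. Where do they come from?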

I only see this behavior on large/small values. Values in the range -2^24 to 2^24 work fine.

I wonder if someone can enlighten me on what happens here. What steps are taken for very large/small values?

This is a follow-up to the question "function to convert float to int (huge integers)", which is not as general as this one.

EDIT, code:

unsigned float_i2f(int x) {
  if (x == 0) return 0;
  /* get sign of x */
  int sign = (x>>31) & 0x1;

  /* absolute value of x */
  int a = sign ? ~x + 1 : x;

  /* calculate exponent */
  int e = 158;
  int t = a;
  while (!(t >> 31) & 0x1) {
    t <<= 1;
    e--;
  };

  /* calculate mantissa */
  int m = (t >> 8) & ~(((0x1 << 31) >> 8 << 1));
  m &= 0x7fffff;

  int res = sign << 31;
  res |= (e << 23);
  res |= m;

  return res;
}

EDIT 2:

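After Adam's comment and the reference to the book "Write Great Code", I updated my routine with rounding. I still get some rounding errors (fortunately only 1 bit off now).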

Now if I do a test run, I get mostly good results, but with a few rounding errors like these:

input:  0xfefffff5
result: 11001011100000000000000000000101
GOAL:   11001011100000000000000000000110  (1 too low)

input:  0x7fffff
result: 01001010111111111111111111111111
GOAL:   01001010111111111111111111111110  (1 too high)

unsigned float_i2f(int x) {
  if (x == 0) return 0;
  /* get sign of x */
  int sign = (x>>31) & 0x1;

  /* absolute value of x */
  int a = sign ? ~x + 1 : x;

  /* calculate exponent */
  int e = 158;
  int t = a;
  while (!(t >> 31) & 0x1) {
    t <<= 1;
    e--;
  };

  /* mask to check which bits get shifted out when rounding */
  static unsigned masks[24] = {
    0, 1, 3, 7, 
    0xf, 0x1f, 
    0x3f, 0x7f, 
    0xff, 0x1ff, 
    0x3ff, 0x7ff, 
    0xfff, 0x1fff, 
    0x3fff, 0x7fff, 
    0xffff, 0x1ffff, 
    0x3ffff, 0x7ffff, 
    0xfffff, 0x1fffff, 
    0x3fffff, 0x7fffff
  };

  /* mask to check whether to round up or down */
  static unsigned HOmasks[24] = {
    0,
    1, 2, 4, 0x8, 0x10, 0x20, 0x40, 0x80,
    0x100, 0x200, 0x400, 0x800, 0x1000, 0x2000, 0x4000, 0x8000, 0x10000, 0x20000, 0x40000, 0x80000, 0x100000, 0x200000, 0x400000
  };

  int S = a & masks[8];
  int m = (t >> 8) & ~(((0x1 << 31) >> 8 << 1));
  m &= 0x7fffff;

  if (S > HOmasks[8]) {
    /* round up */
    m += 1;
  } else if (S == HOmasks[8]) {
    /* tie: round to even */
    m = m + (m & 1);
  }

  /* special case where last bit of exponent is also set in mantissa
   * and mantissa itself is 0 */
  if (m & (0x1 << 23)) {
    e += 1;
    m = 0;
  }

  int res = sign << 31;
  res |= (e << 23);
  res |= m;
  return res;
}

Does anyone have an idea where the problem lies?

Adam Kosiorek

C/C++ floating-point numbers tend to be compatible with the IEEE 754 floating-point standard (e.g. in gcc). The zeros come from the rounding rules.
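
To see where rounding starts to matter, here is a minimal sketch that checks whether an int survives the conversion to float unchanged. It assumes IEEE 754 single precision with 24 significant bits (23 stored plus the implicit leading 1); is_exact_as_float is a name made up for this example.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if x is exactly representable
 * as a float, 0 if the conversion had to round it. */
int is_exact_as_float(int32_t x) {
  assert(FLT_MANT_DIG == 24);  /* assumption: IEEE 754 single precision */
  float f = (float)x;
  /* Compare in 64 bits: (float)INT32_MAX rounds up to 2^31,
   * which would not fit back into an int32_t. */
  return (int64_t)f == (int64_t)x;
}
```

is_exact_as_float(16777216) returns 1 and is_exact_as_float(16777217) returns 0, because 2^24 + 1 needs 25 significant bits.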

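Shifting a number to the right makes some bits on the right-hand side go away. Let's call them guard bits. Let's also call the highest bit of our number HO and the lowest bit LO. Now suppose that the guard bits are still a part of our number. If, for example, we have 3 guard bits, the LO bit has a value of 8 (if it is set). Now if: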

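value of guard bits > 0.5 * value of LO
- round the number up to the next larger value, ignoring the sign
value of guard bits == 0.5 * value of LO
- use the current number value if LO == 0
- number += 1 otherwise
value of guard bits < 0.5 * value of LO
- use the current number value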

Why do 3 guard bits mean the LO value is 8?

Suppose we have a binary 8-bit number:

weights:    128 64 32 16 8 4 2 1
binary num:   0  0  0  0 1 1 1 1

Let's shift it right by 3 bits:

weights:      x x x 128 64 32 16 8 | 4 2 1
binary num:   0 0 0   0  0  0  0 1 | 1 1 1

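As you can see, with 3 guard bits the LO bit ends up at the 4th position and has a weight of 8. This is true only for the purpose of rounding. The weights have to be "normalized" afterwards, so that the weight of the LO bit becomes 1 again.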
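
The rules and the weights example above can be sketched as a single function. This version computes the guard mask and the half-LO value with shifts rather than tables; shift_right_round_even is a hypothetical name, and the code is an illustration under those assumptions, not the book's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift value right by shift bits,
 * rounding to nearest and breaking ties towards even. */
uint32_t shift_right_round_even(uint32_t value, unsigned shift) {
  assert(shift < 32);            /* shifting a 32-bit value by >= 32 is undefined */
  if (shift == 0) return value;  /* nothing is shifted out, nothing to round */

  uint32_t guard  = value & ((1u << shift) - 1u); /* bits that fall off */
  uint32_t half   = 1u << (shift - 1);            /* 0.5 * weight of the LO bit */
  uint32_t result = value >> shift;

  if (guard > half) {
    result += 1;            /* more than half: round up */
  } else if (guard == half) {
    result += result & 1u;  /* exactly half: add 1 only if LO is set */
  }
  /* guard < half: keep the truncated value */
  return result;
}
```

For the 8-bit example, shift_right_round_even(15, 3) gives 2: the guard bits 111 have the value 7, which is more than half of the LO weight (4), so the truncated result 1 is rounded up.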

And how can I check with bit operations whether guard bits > 0.5 * value?

The fastest way is to use lookup tables. Suppose we are working on an 8-bit number:

unsigned number;          // our number
unsigned bitsToShift;     // number of bits to shift

assert(bitsToShift < 8);  // 8 bits

unsigned guardMasks[8] = {0, 1, 3, 7, 0xf, 0x1f, 0x3f, 0x7f};
unsigned LOvalues[8] = {0, 1, 2, 4, 0x8, 0x10, 0x20, 0x40}; // value of LO divided by 2 for faster comparison

unsigned guardBits = number & guardMasks[bitsToShift]; // value of the guard bits
number = number >> bitsToShift;

if (guardBits > LOvalues[bitsToShift]) {
...
} else if (guardBits == LOvalues[bitsToShift]) {
...
} else { // guardBits < LOvalues[bitsToShift]
...
}
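
Putting this together for the int-to-float case from the question, here is a minimal sketch. It rests on one assumption that is not spelled out in the thread: the guard bits are taken from the normalized value (the low 8 bits once the leading 1 sits in bit 31), not from the original absolute value. int_to_float_bits is a hypothetical name, and unsigned arithmetic is used throughout to avoid signed-shift pitfalls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: IEEE 754 single-precision bit pattern of x,
 * rounded to nearest with ties to even. */
uint32_t int_to_float_bits(int32_t x) {
  if (x == 0) return 0;

  uint32_t sign = (x < 0) ? 1u : 0u;
  /* absolute value in unsigned arithmetic, so INT32_MIN works too */
  uint32_t a = sign ? (0u - (uint32_t)x) : (uint32_t)x;
  assert(a != 0);  /* guarantees the normalization loop terminates */

  /* normalize: move the leading 1 into bit 31 */
  uint32_t e = 158;  /* bias 127 + 31 */
  uint32_t t = a;
  while (!(t & 0x80000000u)) {
    t <<= 1;
    e--;
  }

  uint32_t m     = (t >> 8) & 0x7fffffu; /* 23 fraction bits, implicit 1 dropped */
  uint32_t guard = t & 0xffu;            /* the 8 bits shifted out */

  /* round up above half, or at exactly half when LO is set */
  if (guard > 0x80u || (guard == 0x80u && (m & 1u))) {
    m++;
  }
  /* rounding carried out of the fraction: bump the exponent */
  if (m & 0x800000u) {
    m = 0;
    e++;
  }
  return (sign << 31) | (e << 23) | m;
}
```

For the two inputs from EDIT 2 this returns 0xCB800006 for 0xfefffff5 (that is, -16777227) and 0x4AFFFFFE for 0x7fffff, which are the GOAL patterns listed there.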

Reference: Randall Hyde, Write Great Code, Volume 1
