Is there a standard way to reconstruct lowered struct function arguments?

Posted by Necto in Dev


I have a struct type:

typedef struct boundptr {
  uint8_t *ptr;
  size_t size;
} boundptr;

I want to capture all function arguments of that type. For example, in this function:

boundptr sample_function_stub(boundptr lp, boundptr lp2);

On my 64-bit machine, Clang translates that signature to:

define { i8*, i64 } @sample_function_stub(i8* %lp.coerce0, i64 %lp.coerce1, i8* %lp2.coerce0, i64 %lp2.coerce1) #0 {

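Questions:

Is there a better way to reconstruct these arguments?

Is it possible to forbid this argument lowering while keeping the same ABI for external calls?

More context:

So I guess that in LLVM IR the compiler, following the platform ABI, broke the structure down into separate fields (which is not the worst case, see 1). By the way, it reconstructs the original two parameters lp and lp2 later in the function body.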

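Now, for my analysis, I want to get those two parameters lp and lp2 in full, out of these four (lp.coerce0, lp.coerce1, lp2.coerce0 and lp2.coerce1). In this case I can probably rely on the names (.coerce0 means the first field, .coerce1 the second).

I don't like this approach:

I'm not sure Clang will keep this naming convention in later versions.
It certainly depends on the ABI, so on another platform the breakdown may be different.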
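
For illustration, the name-based pairing could look roughly like the sketch below. It works on plain strings rather than on LLVM's API, the helper names split_coerced_name and group_coerced_args are made up for this example, and it assumes the ".coerceN" suffix convention seen in the IR above, which is exactly the convention that may not be stable.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: splits "lp.coerce1" into base "lp" and field index 1.
// Returns false when the name does not end in ".coerce<digits>".
bool split_coerced_name(const std::string &name, std::string &base,
                        std::size_t &index) {
  static const std::string marker = ".coerce";
  const std::size_t pos = name.rfind(marker);
  if (pos == std::string::npos || pos == 0) return false;
  const std::string digits = name.substr(pos + marker.size());
  if (digits.empty()) return false;
  for (char c : digits)
    if (c < '0' || c > '9') return false;
  base = name.substr(0, pos);
  index = std::stoul(digits);
  return true;
}

// Groups argument names by the source-level parameter they came from.
// Result: base name -> pieces ordered by field index.
std::map<std::string, std::vector<std::string>>
group_coerced_args(const std::vector<std::string> &arg_names) {
  std::map<std::string, std::vector<std::string>> groups;
  for (const std::string &name : arg_names) {
    std::string base;
    std::size_t index = 0;
    if (!split_coerced_name(name, base, index)) continue;  // not a lowered piece
    std::vector<std::string> &pieces = groups[base];
    if (pieces.size() <= index) pieces.resize(index + 1);
    // Value names are unique within one function, so a slot is filled once.
    assert(pieces[index].empty() && "duplicate piece for the same field");
    pieces[index] = name;
  }
  return groups;
}
```

Feeding it the four argument names from the signature above yields two groups, "lp" and "lp2", each with two pieces; an argument without the suffix is ignored.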

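On the other hand, I cannot simply look for the reconstruction code at the beginning of the function, because I might confuse it with user code that operates on local variables.

I use Clang 3.4.2, based on LLVM 3.4.2, for the target x86_64-pc-linux-gnu.

P.S. Here is another example showing how Clang can mess with function arguments.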

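Michael Haidl

I assume you are not compiling with -O0. AFAIK, clang reassembles your original type when you do not optimize the code. Clang decomposes your structs in order to pass them to the called function via registers (at least on x86). As you said, this depends on the ABI in use.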
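
To make the register-passing point concrete, here is a small sketch of two necessary conditions, as far as I understand the System V x86-64 convention: the struct must be trivially copyable and must fit in two 8-byte words. The function name may_be_split_into_two_registers is made up for this example, and the check is not the full ABI classification algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

typedef struct boundptr {
  uint8_t *ptr;
  size_t size;
} boundptr;

// Hypothetical check, not the complete ABI classification: a struct can only
// be passed as two separate register-sized pieces if it is trivially copyable
// and no larger than two 8-byte words (16 bytes).
constexpr bool may_be_split_into_two_registers() {
  return std::is_trivially_copyable<boundptr>::value && sizeof(boundptr) <= 16;
}

// The two fields are laid out back to back, without padding between them.
static_assert(offsetof(boundptr, size) == sizeof(uint8_t *),
              "size is expected to follow ptr directly");
```

A larger struct, or one with a non-trivial copy constructor or destructor, would typically be passed through memory instead, and the IR signature would look different.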

Here is a dummy example based on your use case:

#include <cstddef>

typedef struct boundptr {
  void *ptr;
  size_t size;
} boundptr;

boundptr foo(boundptr ptr1, boundptr ptr2) { return {ptr1.ptr, ptr2.size}; }

int main() {
  boundptr p1{}, p2{};  // value-initialize so foo does not read indeterminate values
  boundptr p3 = foo(p1, p2);
  return 0;
}

Compiling it with clang -O0 -std=c++11 -emit-llvm -S -c test.cpp generates the following for foo:

define { i8*, i64 } @_Z3foo8boundptrS_(i8* %ptr1.coerce0, i64 %ptr1.coerce1, i8* %ptr2.coerce0, i64 %ptr2.coerce1) #0 {
  %1 = alloca %struct.boundptr, align 8
  %ptr1 = alloca %struct.boundptr, align 8
  %ptr2 = alloca %struct.boundptr, align 8
  %2 = bitcast %struct.boundptr* %ptr1 to { i8*, i64 }*
  %3 = getelementptr { i8*, i64 }, { i8*, i64 }* %2, i32 0, i32 0
  store i8* %ptr1.coerce0, i8** %3
  %4 = getelementptr { i8*, i64 }, { i8*, i64 }* %2, i32 0, i32 1
  store i64 %ptr1.coerce1, i64* %4
  %5 = bitcast %struct.boundptr* %ptr2 to { i8*, i64 }*
  %6 = getelementptr { i8*, i64 }, { i8*, i64 }* %5, i32 0, i32 0
  store i8* %ptr2.coerce0, i8** %6
  %7 = getelementptr { i8*, i64 }, { i8*, i64 }* %5, i32 0, i32 1
  store i64 %ptr2.coerce1, i64* %7
  %8 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %1, i32 0, i32 0
  %9 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %ptr1, i32 0, i32 0
  %10 = load i8*, i8** %9, align 8
  store i8* %10, i8** %8, align 8
  %11 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %1, i32 0, i32 1
  %12 = getelementptr inbounds %struct.boundptr, %struct.boundptr* %ptr2, i32 0, i32 1
  %13 = load i64, i64* %12, align 8
  store i64 %13, i64* %11, align 8
  %14 = bitcast %struct.boundptr* %1 to { i8*, i64 }*
  %15 = load { i8*, i64 }, { i8*, i64 }* %14, align 8
  ret { i8*, i64 } %15
}

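The boundptr is reconstructed on the called function's stack (this also depends on the calling convention in use).

Now, to find out which boundptr values are your arguments, you can do the following:

Visit each alloca instruction in your pass and follow its users.
Follow the casts and GEP instructions on the alloca to find the store instructions into the boundptr.
Check the values being stored. If they are arguments of your function and match in type and name, you have found the code that reassembles a boundptr.

Of course, you can also do it the other way around, starting from the function's arguments.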
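
The three steps above can be modeled on a toy graph, just to show the shape of the walk. This is a sketch, not LLVM's API: Value, Kind, add_use and collect_stored_args are invented for this example. In a real pass the corresponding classes would be llvm::AllocaInst, llvm::CastInst, llvm::GetElementPtrInst, llvm::StoreInst and llvm::Argument, and the type and name checks from step 3 would still have to be added.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for IR values; this is NOT the LLVM API.
enum class Kind { Argument, Alloca, Cast, GEP, Load, Store };

struct Value {
  Kind kind;
  std::string name;
  Value *stored = nullptr;     // Store only: the value being written
  Value *pointer = nullptr;    // Cast/GEP/Load/Store: the pointer operand
  std::vector<Value *> users;  // instructions that use this value as pointer
  Value(Kind k, std::string n) : kind(k), name(std::move(n)) {}
};

// Records that instruction `user` takes `ptr` as its pointer operand.
void add_use(Value &user, Value &ptr) {
  assert(user.kind != Kind::Argument && user.kind != Kind::Alloca);
  user.pointer = &ptr;
  ptr.users.push_back(&user);
}

// Steps 1 to 3: starting at an alloca, follow casts and GEPs down to stores
// and collect every function argument that is stored into the slot.
void collect_stored_args(const Value *v, std::vector<const Value *> &out) {
  for (const Value *user : v->users) {
    if (user->kind == Kind::Cast || user->kind == Kind::GEP) {
      collect_stored_args(user, out);  // keep following the address
    } else if (user->kind == Kind::Store && user->pointer == v &&
               user->stored != nullptr &&
               user->stored->kind == Kind::Argument) {
      out.push_back(user->stored);     // an argument written into the slot
    }
    // Loads and any other users are ignored.
  }
}
```

Modeling %ptr1 from the IR above (alloca, bitcast %2, GEPs %3 and %4, two stores) and calling collect_stored_args on the alloca returns the two arguments ptr1.coerce0 and ptr1.coerce1, in that order. An alloca whose stores do not come from arguments yields an empty result, which is how user locals would be told apart in this model.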

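Is this future proof? Definitely not. Clang/LLVM is not designed to maintain backward compatibility. For compatibility, it is the ABI that matters.

Drawback: you have to get into the optimizer very early, right after code generation. Even -O1 will remove these stack allocations of boundptr. So you have to modify clang to run your pass during optimization, and you cannot make it a standalone pass (e.g., one run by opt).

Better solution: since clang has to be modified in some way anyway, you can add metadata that identifies your boundptr type. That lets you "pack" the fragments of a boundptr together and mark them as a boundptr. This will survive the optimizer.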


Edited on 2021-02-25
