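I am running into a problem with the recursive sort solution I proposed here.
I cannot sort a YAML file that uses anchors and sub-elements: the .pop method call raises a KeyError exception.
For example: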
volvo:
  anchor_struct: &anchor_struct
    zzz:
      val: "bar"
    aaa:
      val: "foo"
  aaa: "Authorization"
  zzz: 341
  anchr_val: &anchor_val famous_val
lambo:
  <<: *anchor_struct
  mykey:
    myval:
      enabled: false
  anchor_struct:
    <<: *anchor_struct
    username: user
  anchor_val: *anchor_val
  zzz: zorglub
  www: web
  File "orderYaml.py", line 36, in recursive_sort_mappings
    value = s.pop(key)
  File "/usr/local/lib/python3.6/dist-packages/ruamel/yaml/comments.py", line 818, in __delitem__
    referer.update_key_value(key)
  File "/usr/local/lib/python3.6/dist-packages/ruamel/yaml/comments.py", line 947, in update_key_value
    ordereddict.__delitem__(self, key)
KeyError: 'aaa'
This error occurs when the anchored element of the YAML file contains an additional element, as here:
volvo:
  anchor_struct: &anchor_struct
    extra:
      zzz:
        val: "bar"
      aaa:
        val: "foo"
  aaa: "Authorization"
  zzz: 341
  anchr_val: &anchor_val famous_val
lambo:
  <<: *anchor_struct
  mykey:
    myval:
      enabled: false
  anchor_struct:
    <<: *anchor_struct
    username: user
  anchor_val: *anchor_val
  zzz: zorglub
  www: web
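As the cherry on the cake: is there a way to keep the anchor definitions (&...) on the "volvo" element after sorting? I want to manipulate the sorted result so that the "volvo" element is always repositioned after processing.
My goal is to end up with this file: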
lambo:
<<: *anchor_struct
anchor_struct:
<<: *anchor_struct
mykey:
myval:
enabled: false
username: user
anchor_val: *anchor_val
www: web
zzz: zorglub
volvo:
aaa: "Authorization"
anchor_struct: &anchor_struct
aaa:
val: "foo"
zzz:
val: "bar"
anchr_val: &anchor_val famous_val
zzz: 341
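Do you see another solution? My goal is to verify that alphabetical order is respected in all of our YAML files.
EDIT #1:
Here is another example of what I am trying to achieve.
This is a sample of the input and the expected output:
Input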
_world:
  anchor_struct: &anchor_struct
    foo:
      val: "foo"
    bar:
      val: "foo"
  string: "string"
  newmsg: &newmsg
    msg: "msg"
    foo: "foo"
    new: "new"
  anchr_val: &anchor_val famous_val
  bool: True
elem2:
  myStruct:
    <<: *anchor_struct
  anchor_val: *anchor_val
  <<: *anchor_struct
  zzz: zorglub
  www: web
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
elem1:
  <<: *anchor_struct
  zzz: zorglub
  newmsg:
    <<: *newmsg
    msg: "msg2"
  myStruct:
    <<: *anchor_struct
  anchor_struct:
    second_elem: "second_elem"
    <<: *anchor_struct
    other_elem: "other_elem"
  www: web
  anchor_val: *anchor_val
Expected output
_world:
  anchor_struct: &anchor_struct
    bar:
      val: "foo"
    foo:
      val: "foo"
  anchr_val: &anchor_val famous_val
  bool: True
  newmsg: &newmsg
    foo: "foo"
    msg: "msg"
    new: "new"
  string: "string"
elem1:
  <<: *anchor_struct
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
    second_elem: "second_elem"
  anchor_val: *anchor_val
  myStruct:
    <<: *anchor_struct
  newmsg:
    <<: *newmsg
    msg: "msg2"
  www: web
  zzz: zorglub
elem2:
  <<: *anchor_struct
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
  anchor_val: *anchor_val
  myStruct:
    <<: *anchor_struct
  www: web
  zzz: zorglub
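The way to approach this kind of problem is to first add the expected and necessary imports, define the input and the expected output as multi-line strings, and add a useful diff method to the YAML instance.
While testing, string input is easier to work with than files: everything is in one file (which makes it easy to see whether some trailing spaces need removing), and you cannot overwrite your input and start the next run from something different than the first.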
import sys
import difflib
import ruamel.yaml
from ruamel.yaml.comments import merge_attrib
yaml_in = """\
_world:
  anchor_struct: &anchor_struct
    foo:
      val: "foo"
    bar:
      val: "foo"
  string: "string"
  newmsg: &newmsg
    msg: "msg"
    foo: "foo"
    new: "new"
  anchr_val: &anchor_val famous_val
  bool: True
elem2:
  myStruct:
    <<: *anchor_struct
  anchor_val: *anchor_val
  <<: *anchor_struct
  zzz: zorglub
  www: web
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
elem1:
  <<: *anchor_struct
  zzz: zorglub
  newmsg:
    <<: *newmsg
    msg: "msg2"
  myStruct:
    <<: *anchor_struct
  anchor_struct:
    second_elem: "second_elem"
    <<: *anchor_struct
    other_elem: "other_elem"
  www: web
  anchor_val: *anchor_val
"""
yaml_out = """\
_world:
  anchor_struct: &anchor_struct
    bar:
      val: "foo"
    foo:
      val: "foo"
  anchr_val: &anchor_val famous_val
  bool: True
  newmsg: &newmsg
    foo: "foo"
    msg: "msg"
    new: "new"
  string: "string"
elem1:
  <<: *anchor_struct
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
    second_elem: "second_elem"
  anchor_val: *anchor_val
  myStruct:
    <<: *anchor_struct
  newmsg:
    <<: *newmsg
    msg: "msg2"
  www: web
  zzz: zorglub
elem2:
  <<: *anchor_struct
  anchor_struct:
    <<: *anchor_struct
    other_elem: "other_elem"
  anchor_val: *anchor_val
  myStruct:
    <<: *anchor_struct
  www: web
  zzz: zorglub
"""
def diff_yaml(self, data, s, fnin="in", fnout="out"):
    # dump data if necessary and compare with s
    inl = [l.rstrip() + '\n' for l in s.splitlines()]  # trailing space at end of line disregarded
    if not isinstance(data, str):
        buf = ruamel.yaml.compat.StringIO()
        self.dump(data, buf)
        outl = buf.getvalue().splitlines(True)
    else:
        outl = [l.rstrip() + '\n' for l in data.splitlines()]
    diff = difflib.unified_diff(inl, outl, fnin, fnout)
    result = True
    for line in diff:
        sys.stdout.write(line)
        result = False
    return result
ruamel.yaml.YAML.diff = diff_yaml
yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.boolean_representation = ["False", "True"]
yaml.preserve_quotes = True
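The comparison logic of the diff helper can also be tried without ruamel.yaml. The following is a minimal standard-library sketch of the same idea; the function name diff_ignoring_trailing_space and the two sample strings are made up for illustration and are not part of the setup above.

```python
import difflib


def diff_ignoring_trailing_space(expected, actual, fnin="in", fnout="out"):
    """Return unified-diff lines for two strings, ignoring trailing whitespace."""
    exp_lines = [line.rstrip() + "\n" for line in expected.splitlines()]
    act_lines = [line.rstrip() + "\n" for line in actual.splitlines()]
    return list(difflib.unified_diff(exp_lines, act_lines, fnin, fnout))


# Only trailing spaces differ, so no diff lines are produced.
same = diff_ignoring_trailing_space("a: 1  \nb: 2\n", "a: 1\nb: 2\n")

# The key order differs, so the diff is non-empty.
changed = diff_ignoring_trailing_space("a: 1\nb: 2\n", "b: 2\na: 1\n")

print(same)
print("".join(changed))
```

An empty result plays the role of the True return value of the diff method: nothing printed means the dumped data matches the expected string.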
Then make sure your expected output is valid and can be round-tripped:
dout = yaml.load(yaml_out)
buf = ruamel.yaml.compat.StringIO()
yaml.dump(dout, buf)
assert yaml.diff(dout, yaml_out)
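This gives neither output nor an assertion error (the expected output contains trailing spaces, as well as the non-default True boolean). If the expected output cannot be round-tripped, ruamel.yaml might not be able to dump your expected output at all.
If you get stuck, you can now inspect dout to determine what your parsed input should look like.
So now try recursive_sort_mappings: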
def recursive_sort_mappings(s):
    if isinstance(s, list):
        for elem in s:
            recursive_sort_mappings(elem)
        return
    if not isinstance(s, dict):
        return
    for key in sorted(s, reverse=True):
        value = s.pop(key)
        recursive_sort_mappings(value)
        s.insert(0, key, value)
din = yaml.load(yaml_in)
recursive_sort_mappings(din)
yaml.diff(din, yaml_out)
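This gives quite a bit of output, because recursive_sort_mappings does not know about merges and runs over all keys, including the merged-in ones. The library tries to keep the merge key in its original position, and on top of that, popping a key (before it is reinserted in the first position) involves some magic when the popped value exists in a merged mapping: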
--- in
+++ out
@@ -1,8 +1,8 @@
 _world:
   anchor_struct: &anchor_struct
-    bar:
+    bar: &id001
       val: "foo"
-    foo:
+    foo: &id002
       val: "foo"
   anchr_val: &anchor_val famous_val
   bool: True
@@ -14,24 +14,38 @@
 elem1:
   <<: *anchor_struct
   anchor_struct:
+    bar: *id001
     <<: *anchor_struct
+    foo: *id002
     other_elem: "other_elem"
     second_elem: "second_elem"
   anchor_val: *anchor_val
+  bar: *id001
+  foo: *id002
   myStruct:
     <<: *anchor_struct
+    bar: *id001
+    foo: *id002
   newmsg:
     <<: *newmsg
+    foo: "foo"
     msg: "msg2"
+    new: "new"
   www: web
   zzz: zorglub
 elem2:
-  <<: *anchor_struct
   anchor_struct:
     <<: *anchor_struct
+    bar: *id001
+    foo: *id002
     other_elem: "other_elem"
   anchor_val: *anchor_val
+  <<: *anchor_struct
+  bar: *id001
+  foo: *id002
   myStruct:
     <<: *anchor_struct
+    bar: *id001
+    foo: *id002
   www: web
   zzz: zorglub
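Why a loop over all keys touches keys that were never written in the mapping itself can be shown with a rough standard-library analogy. This sketch uses collections.ChainMap to stand in for a mapping with a merge; it is only an analogy and does not reflect how ruamel.yaml implements merges internally. The variable names are made up for illustration.

```python
from collections import ChainMap

# Stands in for the mapping behind *anchor_struct.
merged_in = {"aaa": {"val": "foo"}, "zzz": {"val": "bar"}}
# Keys written directly in the mapping that contains the merge.
own = {"zzz": "zorglub", "www": "web"}

# Lookups fall through to merged_in when a key is not in own.
view = ChainMap(own, merged_in)

all_keys = sorted(view)          # what a naive loop over the mapping sees
own_keys = sorted(view.maps[0])  # only the keys that really live in the mapping

try:
    view.pop("aaa")              # 'aaa' is only reachable through the merge
    pop_failed = False
except KeyError:
    pop_failed = True

print(all_keys, own_keys, pop_failed)
```

In this analogy the key aaa is visible when iterating but cannot be removed from the mapping itself, which is the same shape of problem as sorting by popping and reinserting every visible key.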
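To solve this you need to do several things. First, you need to abandon .insert(), which emulates (for the Python 3 built-in OrderedDict) the method defined by the C ordereddict package ruamel.ordereddict. This emulation recreates the OrderedDict, and that leads to the duplication. The Python 3 C implementation has a method that is less powerful than .insert() but useful in this case: move_to_end (which could be used in an update to the .insert() emulation in ruamel.yaml).
Second, you need to walk over the "real" keys only, not over the keys provided by merges, so you cannot simply iterate with for key in over the mapping itself.
Third, you need to move the merge key to the top of the mapping if it is somewhere else.
(The level argument was added for debugging purposes.)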
def recursive_sort_mappings(s, level=0):
    if isinstance(s, list):
        for elem in s:
            recursive_sort_mappings(elem, level=level+1)
        return
    if not isinstance(s, dict):
        return
    merge = getattr(s, merge_attrib, [None])[0]
    if merge is not None and merge[0] != 0:  # << not in first position, move it
        setattr(s, merge_attrib, [(0, merge[1])])
    for key in sorted(s._ok):  # _ok -> set of Own Keys, i.e. not merged in keys
        value = s[key]
        # print('v1', level, key, super(ruamel.yaml.comments.CommentedMap, s).keys())
        recursive_sort_mappings(value, level=level+1)
        # print('v2', level, key, super(ruamel.yaml.comments.CommentedMap, s).keys())
        s.move_to_end(key)
din = yaml.load(yaml_in)
recursive_sort_mappings(din)
assert yaml.diff(din, yaml_out)
The diff then no longer gives any output.
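The move_to_end technique, and the stated goal of verifying alphabetical order, can be sketched on plain collections.OrderedDict data without ruamel.yaml. This is a simplified illustration under the assumption that there are no merge keys or anchors; sort_in_place and keys_are_sorted are hypothetical helper names, not part of any library.

```python
from collections import OrderedDict


def sort_in_place(node):
    """Recursively sort mapping keys without rebuilding any mapping."""
    if isinstance(node, list):
        for item in node:
            sort_in_place(item)
    elif isinstance(node, OrderedDict):
        # sorted() takes a snapshot of the keys, so reordering inside the loop is safe.
        for key in sorted(node):
            sort_in_place(node[key])
            node.move_to_end(key)  # once every key has been moved, the order is sorted


def keys_are_sorted(node):
    """Return True if every mapping at or below node has its keys in sorted order."""
    if isinstance(node, list):
        return all(keys_are_sorted(item) for item in node)
    if isinstance(node, dict):
        keys = list(node)
        return keys == sorted(keys) and all(keys_are_sorted(v) for v in node.values())
    return True


data = OrderedDict([
    ("zzz", "zorglub"),
    ("mykey", OrderedDict([("b", 2), ("a", 1)])),
    ("www", "web"),
])
inner = data["mykey"]
before = keys_are_sorted(data)
sort_in_place(data)
after = keys_are_sorted(data)
print(before, after, list(data))
```

Because the mappings are reordered rather than recreated, nested objects keep their identity, which is the property that matters when the same object is referenced from several places through an anchor.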