用Python实现Kahn的拓扑排序算法

debugcn 发表于 Dev

维姆

Kahn 在 62 提出了一种算法来对任何 DAG（有向无环图）进行拓扑排序，伪代码复制自维基百科：

L ← Empty list that will contain the sorted elements 
S ← Set of all nodes with no incoming edges 
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph  # This is a DESTRUCTIVE step!
        if m has no other incoming edges then
            insert m into S if graph has edges then
    return error (graph has at least one cycle) else 
    return L (a topologically sorted order)

我需要使用 IPython3 来实现它，并使用以下 DAG 实现：

class Node(object):
    def __init__(self, name, parents):
        assert isinstance(name, str)
        assert all(isinstance(_, RandomVariable) for _ in parents)
        self.name, self.parents = name, parents

其中name是节点的标签并parents存储其所有父节点。然后 DAG 类实现为：

class DAG(object):
    def __init__(self, *nodes):
        assert all(isinstance(_, Node) for _ in nodes)
        self.nodes = nodes

（DAG 实现是固定的，不会改进。）然后我需要将 Kahn 算法实现为一个函数top_order，它接受一个 DAG 实例并返回一个像(node_1, node_2, ..., node_n). 主要的问题是，这个算法是破坏性的，因为它的步骤之一是remove edge e from the graph（第 5 行），它会删除的一个成员m.parents。但是，我必须保持 DAG 实例完好无损。

到目前为止我能想到的一种方法是创建一个被引入的 DAG 实例的深拷贝（即使是浅拷贝也无法完成这项工作，因为算法仍然通过引用破坏了原始实例），并对此执行破坏性算法复制，然后得到这个副本的节点名称的正确顺序（假设节点之间没有命名冲突），然后使用这个名称顺序来推断原始实例的节点的正确顺序，大致如下：

def top_order(network):
    '''takes in a DAG, prints and returns a topological ordering.'''
    assert type(network) == DAG
    temp = copy.deepcopy(network) # to leave the original instance intact

    ordering_name = []
    roots = [node for node in temp.nodes if not node.parents]
    while roots:
        n_node = roots[0]
        del roots[0]
        ordering_name.append(n_node.name)
        for m_node in temp.nodes:
            if n_node in m_node.parents:
                temp_list = list(m_node.parents)
                temp_list.remove(n_node)
                m_node.parents = tuple(temp_list)
                if not m_node.parents:
                    roots.append(m_node)

    print(ordering_name) # print ordering by name

    # gets ordering of nodes of the original instance
    ordering = []
    for name in ordering_name:
        for node in network.nodes:
            if node.name == name:
                ordering.append(node)

    return tuple(ordering)

两个问题：第一，当network大的时候深拷贝会消耗资源；其次，我想要改进我的嵌套for循环，以获得原始实例的顺序。（第二，我想像sorted方法等这样的东西突然出现在我的脑海中。）

有什么建议吗？

蒂姆·彼得斯

我将建议算法的一个不太直接的实现：您根本不需要操作 DAG，您只需要操作有关DAG 的信息。算法需要的唯一“有趣”的事情是从节点到其子节点的映射（与 DAG 实际存储的相反），以及每个节点的父节点的数量。

这些很容易计算，并且可以使用 dicts 将此信息与节点名称相关联（假设所有名称都是不同的 - 如果不是，您可以使用更多代码发明唯一名称）。

那么这应该工作：

def topsort(dag):
    name2node = {node.name: node for node in dag.nodes}
    # map name to number of predecessors (parents)
    name2npreds = {}
    # map name to list of successors (children)
    name2succs = {name: [] for name in name2node}

    for node in dag.nodes:
        thisname = node.name
        name2npreds[thisname] = len(node.parents)
        for p in node.parents:
            name2succs[p.name].append(thisname)

    result = [n for n, npreds in name2npreds.items() if npreds == 0]
    for p in result:
        for c in name2succs[p]:
            npreds = name2npreds[c]
            assert npreds
            npreds -= 1
            name2npreds[c] = npreds
            if npreds == 0:
                result.append(c)

    if len(result) < len(name2node):
        raise ValueError("no topsort - cycle")
    return tuple(name2node[p] for p in result)

这里有一个微妙的点：外部循环result 在迭代时附加到result。那是故意的。效果是result外循环只处理in 中的每个元素，无论元素是在初始中result还是在后来添加。

请注意，当遍历inputDAG和Nodes 时，它们中的任何内容都没有改变。

本文收集自互联网，转载请注明来源。

如有侵权，请联系[email protected] 删除。

编辑于2021-07-11

我来说两句

0条评论

登录后参与评论

来自分类Dev

Related 相关文章

文章

用Python实现Kahn的拓扑排序算法

用Python实现Kahn的拓扑排序算法

拓扑排序python实现中的错误

用卡恩算法求解 CouseSchedule 的拓扑排序中的 indegrees

拓扑排序（卡恩算法）的麻烦

用拓扑排序计数路径

Unix排序算法的实现

排序算法的实现

用Javascript实现算法

Dijkstra的算法与DAG拓扑排序图中的松弛边

使用源删除算法的拓扑排序结果未显示

尝试在Python中实现卡片组排序算法

拓扑排序

MIPS中的排序算法实现

在我的图形实现中找到边数并执行拓扑排序

在我的图形实现中找到边数并执行拓扑排序

用Java实现Pi算法

用Java实现Pi算法

用 Java 实现 Dijkstra 算法

PageRank python实现，算法

dfs和拓扑排序之间有区别吗？是否可以在不使用dfs的情况下实现拓扑排序？

在JavaScript中实现合并排序算法

我的快速排序算法实现中的错误

难以实现计数排序算法

c语言中的排序算法实现

快速排序算法在c中的实现

为什么这种基于DFS的拓扑排序算法有效？

请说明此算法的流程及其工作原理（Knuth拓扑排序）

如何在python tkinter中正确实现气泡排序算法？

我使用python进行拓扑排序的工具中的错误

功能拓扑排序