Power BI 和 R 中具有来自多个字段的主键的关系数据

韦斯特兰

背景:

这个问题可能有点宽泛,但希望任何使用关系数据、R、Power BI 或以上所有内容的人都会感兴趣。

我试图重新创建数据集关系模型nycflights13书中所描述的R用科学的数据通过Wickham and Grolemund我正在尝试同时使用 R 和 Power BI。数据集由5个表airlinesariportsflightsweatherplanes在部分中13.2 nycflights13有一段话指出:

航班通过起点(位置)和年、月、日和小时(时间)连接到天气。

关系如下图所示:

在此处输入图片说明

问题 1:如何在 Power BI 中设置此模型?

使用以下 R 脚本将使文件夹中的数据集可用于 Power BI c:/data

# install.packages("tidyverse")
# install.packages("nycflights13")

library(tidyverse)
library(nycflights13)
setwd("C:/data/")
#getwd()

airlines
df_airlines <- data.frame(airlines)
df_airports <- data.frame(airports)
df_planes <- data.frame(planes)
df_weather <- data.frame(weather)
df_flights <- data.frame(flights)

write.csv(df_airlines, file = "C:/data/airlines.txt", row.names = FALSE)
write.csv(df_airports, file = "C:/data/airports.txt", row.names = FALSE)
write.csv(df_planes, file = "C:/data/planes.txt", row.names = FALSE)
write.csv(df_weather, file = "C:/data/weather.txt", row.names = FALSE)
write.csv(df_flights, file = "C:/data/flights.txt", row.names = FALSE)

在 Power BI 中导入表后,我正在尝试建立以下关系Relationships tab

在此处输入图片说明

而我能够这样做在一定程度上,但是当我尝试连接flightsweather使用例如year,我收到以下错误信息:

您无法在这两个列之间创建关系,因为其中一列必须具有唯一值。

而且我知道发生这种情况是因为主键必须包含唯一值并且不能包含空值但是,如何在 Power BI 中建立由多个字段组成的主键?

Question 2: If there's no answer to question 1, how can you do this in R instead?

I really love this book, and It may even be described there already, but how do you establish a relationship like this in R? Or perhaps you don't need to since you can join on multiple columns or composite key using dplyr without there being 'established' a relatinship at all?

Put another way, are the relationship illustrated by the figure aboe with the arrows:

在此处输入图片说明

and in Power BI with the lines:

在此处输入图片说明

really not necessary in R as long as you have the required verbs and there actually does exist a relatinship between the data in the different tables?

Question 3 - Why is flight highlighted in the flights table:

I thought that a highlighted column name indicated that there had been established a connection between tables using that column. But as far as I can tell, that is not the case here, and there is no arrow pointing to it:

在此处输入图片说明

Does it perhaps indicate that it is a primary key in the flights table without any connection to another table?


I know this is a bit broad, but I'm really curious about these things so I'm hoping some of you will find it interesting!

RADO

I can comment on Power BI part.

The key issue here is that Power BI requires Dimensional Model, not relational one. There is a huge difference.

As described, the model from the book is not suitable for BI tools, it must be redesigned. For example, table "Weather" in the book is presented as a "dimension", while in reality it must be a fact table (similar to table "Flights"). As a result, "Flights" and "Weather" should never have direct connections - they must share common dimensions, such as:

  • Airport
  • Airline
  • Plane
  • Date
  • Time

Similarly, multiple keys and multiple connections between tables are very rare exceptions and are frowned upon (usually, they are indications of design mistakes). In a properly designed model, you should never see them.

If you want to understand the issue more, read this book: Star Schema Complete Reference

为了具体回答您的 Q3,在维度建模中,“航班”(我假设它是航班号)被称为“退化维度”。通常,它本来是维度表的键,但如果它不存在,它将作为孤立键保留在事实表中。这种情况常见于订单号、发票号等。

退化维度

总体而言,您走在正确的轨道上 - 如果您弄清楚如何将书中的模型转换为合适的星型模式,然后在 R 和 PowerBI 中使用它,您会对新功能印象深刻 - 这是值得的。

本文收集自互联网,转载请注明来源。

如有侵权,请联系[email protected] 删除。

编辑于
0

我来说两句

0条评论
登录后参与评论

相关文章

来自分类Dev

Power BI和JSON数据

来自分类Dev

Power BI和Power Automation中的Python脚本

来自分类Dev

R脚本未在Power BI中运行

来自分类Dev

在BI和AWS中部署了Power BI?

来自分类Dev

Power BI细分和总计

来自分类Dev

如何在Power BI中连接流数据和显示

来自分类Dev

Power BI,DAX,多对一和关系表

来自分类Dev

如何从 Power BI 中的 r 脚本中运行 r 脚本?

来自分类Dev

REST API和PowerBI Designer的Power BI数据集

来自分类Dev

Power BI中的MAX功能和分组依据

来自分类Dev

Power BI中的SWITCH True Multiple列和条件

来自分类Dev

Power BI 中的计数和搜索相结合

来自分类Dev

更新 R 脚本生成的 Power BI 数据

来自分类Dev

Power BI中R Visual的第一步

来自分类Dev

为BI提供Power BI多个条件

来自分类Dev

Power Bi Desktop中的Power BI Rest Api数据集

来自分类Dev

Power BI报表共享和导入模式

来自分类Dev

Microsoft Power BI:RANKX和结果汇总

来自分类Dev

在Power bi中使用OR和AND功能

来自分类Dev

Azure ApplicationInsight 和 Power Bi - 实时流

来自分类Dev

在Power Bi中取消数据透视表和单独的整数和字符串

来自分类Dev

在 Power BI 中发布数据

来自分类Dev

Power BI - 每月显示数据

来自分类Dev

Power BI 数据类型

来自分类Dev

Office 365中的Power BI

来自分类Dev

Power BI 中的“子查询”

来自分类Dev

Power BI 中的累计计数

来自分类Dev

Power BI 中缺少菜单

来自分类Dev

Power BI 中的减法计数