Calculations across more than two different dataframes in R

Pryore

I'm trying to move some work previously done in Excel into R. All I need to do is translate two basic COUNTIF formulae into a readable R script. In Excel, I would use three tables and calculate across them using 'point-and-click' methods, but I'm lost as to how to approach it in R.

My original dataframes are large, so for this question I've posted sample dataframes:

OperatorData <- data.frame(
  Operator  = c("A", "B", "C"),
  Locations = c(850, 575, 2175)
)

AreaData <- data.frame(
  Area         = c("Torbay", "Torquay", "Tooting", "Torrington", "Taunton", "Torpley"),
  SumLocations = c(1000, 500, 500, 250, 600, 750)
)

OperatorAreaData <- data.frame(
  Operator  = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  Area      = c("Torbay", "Tooting", "Taunton",
                "Torbay", "Taunton", "Torrington",
                "Tooting", "Torpley", "Torquay", "Torbay", "Torrington"),
  Locations = c(250, 400, 200,
                100, 400, 75,
                100, 750, 500, 650, 175)
)

What I'm trying to do is add two new columns to the OperatorData dataframe: one giving the number of areas each operator operates in, and another giving the number of those areas in which the operator owns more than 50% of the locations.

So the resulting dataframe would look like this:

Operator     Locations   AreaCount    Own_GE_50percent
A            850         3            1
B            575         3            1
C            2175        5            4

So far, I've managed to calculate the first column using the table function and then appending:

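# Count the rows (areas) recorded for each operator
OpAreaCount <- data.frame(table(OperatorAreaData$Operator))
names(OpAreaCount) <- c("Operator", "AreaCount")
# Match on Operator rather than relying on both dataframes sharing a row order
OperatorData$AreaCount <- OpAreaCount$AreaCount[match(OperatorData$Operator, OpAreaCount$Operator)]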

This is fairly straightforward, but I'm stuck on how to calculate the second column, with its 50% condition.

AntoniosK
library(dplyr)

OperatorAreaData %>%
  inner_join(AreaData, by="Area") %>%
  group_by(Operator) %>%
  summarise(AreaCount = n_distinct(Area),
            Own_GE_50percent = sum(Locations > (SumLocations/2)))

# # A tibble: 3 x 3
#   Operator AreaCount Own_GE_50percent
#   <fct>        <int>            <int>
# 1 A                3                1
# 2 B                3                1
# 3 C                5                4

You can use AreaCount = n() if you're sure you have unique Area values for each Operator.
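
The summary above contains only the two new columns, while the question asks for them to be added to OperatorData. A minimal sketch of one way to do that is below, using the sample data from the question and assuming dplyr is installed; the names OperatorSummary and Result are illustrative only and do not come from the original post.

```r
library(dplyr)

# Sample data copied from the question
OperatorData <- data.frame(
  Operator  = c("A", "B", "C"),
  Locations = c(850, 575, 2175)
)
AreaData <- data.frame(
  Area         = c("Torbay", "Torquay", "Tooting", "Torrington", "Taunton", "Torpley"),
  SumLocations = c(1000, 500, 500, 250, 600, 750)
)
OperatorAreaData <- data.frame(
  Operator  = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  Area      = c("Torbay", "Tooting", "Taunton",
                "Torbay", "Taunton", "Torrington",
                "Tooting", "Torpley", "Torquay", "Torbay", "Torrington"),
  Locations = c(250, 400, 200, 100, 400, 75, 100, 750, 500, 650, 175)
)

# Per-operator summary, same logic as the pipeline above
# (OperatorSummary is a hypothetical name chosen for this sketch)
OperatorSummary <- OperatorAreaData %>%
  inner_join(AreaData, by = "Area") %>%
  group_by(Operator) %>%
  summarise(
    AreaCount        = n_distinct(Area),
    Own_GE_50percent = sum(Locations > SumLocations / 2)
  )

# Attach the two new columns to OperatorData, matching rows on Operator
Result <- OperatorData %>%
  left_join(OperatorSummary, by = "Operator")

print(Result)
```

Because left_join keeps every row of OperatorData, an operator with no rows in OperatorAreaData would get NA in both new columns. The comparison is strict (>), matching "more than 50%" in the question; if an area where the operator holds exactly 50% should also count, as the column name Own_GE_50percent suggests, >= could be used instead. The sample data contains no exact 50% case, so both give the same result here.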
