Python Count Number of Unique Values within Data frame within a group

dnaylor93

I have a data frame named 'sal' that contains salary information for employees across a number of years.

I am trying to calculate the number of job titles that were held by only one person in the year 2013. From a manual check, I know the answer is 202.

I'm using the following method:

sal[sal['Year'] == 2013]['JobTitle'].nunique()

Data Sample:

    Id  EmployeeName    JobTitle    BasePay OvertimePay OtherPay    Benefits    TotalPay    TotalPayBenefits    Year    Notes   Agency  Status
72926   Gregory P Suhr  Chief of Police 319275.01   0   20007.06    86533.21    339282.07   425815.28   2013        San Francisco   
72927   Joanne M Hayes-White    Chief, Fire Department  313686.01   0   23236   85431.39    336922.01   422353.4    2013        San Francisco   
72928   Samson  Lai Battalion Chief, Fire Suppress  186236.42   131217.63   29648.27    57064.95    347102.32   404167.27   2013        San Francisco   
72929   Ellen G Moffatt Asst Med Examiner   272855.51   23727.91    38954.54    66198.92    335537.96   401736.88   2013        San Francisco   
72930   Robert L Shaw   Dep Dir for Investments, Ret    315572.01   0   0   82849.66    315572.01   398421.67   2013        San Francisco   
72931   David L Franklin    Asst Chf of Dept (Fire Dept)    215265.6    87985.24    30637.48    62890.36    333888.32   396778.68   2013        San Francisco   
72932   Harlan L Kelly-Jr   Executive Contract Employee 313312.52   0   0   82319.51    313312.52   395632.03   2013        San Francisco   
72933   John L Martin   Dept Head V 311758.96   0   1098.64 82476.85    312857.6    395334.45   2013        San Francisco   
72934   Edward D Reiskin    Gen Mgr, Public Trnsp Dept  305307.89   0   0   80860.6 305307.89   386168.49   2013        San Francisco   
72935   Thomas A Siragusa   Asst Chf of Dept (Fire Dept)    215265.6    88028.54    21526.49    61288.58    324820.63   386109.21   2013        San Francisco   
72936   Amy P Hart  Dept Head V 286480.44   0   17188.71    80077.63    303669.15   383746.78   2013        San Francisco   
72937   Yifang  Qian    Senior Physician Specialist 203710  0   119176.84   58810.96    322886.84   381697.8    2013        San Francisco   
72938   Michael J Biel  Deputy Chief 3  278964  0   17587.86    77708.48    296551.86   374260.34   2013        San Francisco   
72939   Raymond A Guzman    Dep Chf of Dept (Fire Dept) 270756.03   0   24181.02    77474.92    294937.05   372411.97   2013        San Francisco   
72940   Marty A Ross    Battalion Chief, Fire Suppress  186236.43   88345.08    38035.09    58991.75    312616.6    371608.35   2013        San Francisco   
72941   Mark A Gonzales Dep Chf of Dept (Fire Dept) 270756.01   0   20236.5 77408.16    290992.51   368400.67   2013        San Francisco   
72942   Mark J Johnson  Battalion Chief, Fire Suppress  186236.41   101466.96   23994.92    56134.3 311698.29   367832.59   2013        San Francisco   
72943   Bryan W Rubenstein  Battalion Chief, Fire Suppress  186236.45   94450.92    30313.49    56508.46    311000.86   367509.32   2013        San Francisco   
72944   Gary L Altenberg    Lieutenant, Fire Suppression    135903.02   163477.81   20994.96    46030.76    320375.79   366406.55   2013        San Francisco   
72945   John J Loftus   Deputy Chief 3  274126.5    0   13358.1 75909.1 287484.6    363393.7    2013        San Francisco   
72946   Edwin M Lee Mayor   285446.37   0   0   77105.29    285446.37   362551.66   2013        San Francisco   
72947   Michael J Morris    Assistant Deputy Chief 2    124054  0   202322.37   35929.84    326376.37   362306.21   2013        San Francisco   
72948   David  Shinn    Deputy Chief 3  278964  0   6428.79 76680.57    285392.79   362073.36   2013        San Francisco   
72949   Arthur W Kenney Asst Chf of Dept (Fire Dept)    213308.64   49139.25    36262.42    60756.95    298710.31   359467.26   2013        San Francisco   
72950   Lorrie A Kalos  Battalion Chief, Fire Suppress  186236.49   87457.68    28003.53    57030.95    301697.7    358728.65   2013        San Francisco   
72951   Lyn  Tomioka    Deputy Chief 3  278964  0   3536.35 76113.13    282500.35   358613.48   2013        San Francisco   
72952   Denise A Schmitt    Deputy Chief 3  278964  0   3536.39 75367.15    282500.39   357867.54   2013        San Francisco   
72953   Rudy J Castellanos  Battalion Chief, Fire Suppress  186236.42   94274.25    19022.95    55351.53    299533.62   354885.15   2013        San Francisco   
72954   Susan  Currin   Adm, SFGH Medical Center    271831.5    0   5000    75511.72    276831.5    352343.22   2013        San Francisco   
72955   Thomas F Abbott Battalion Chief, Fire Suppress  186236.41   84382.38    23279.44    56184.01    293898.23   350082.24   2013        San Francisco   
72956   Naomi M Kelly   Dept Head V 270641.5    0   3000    74867.87    273641.5    348509.37   2013        San Francisco   
72957   Trent E Rhorer  Dept Head V 270641.56   0   3000    74769.34    273641.56   348410.9    2013        San Francisco   
72958   Barbara A Garcia    Dept Head V 270591.04   0   3050.5  74769.33    273641.54   348410.87   2013        San Francisco   
72959   Robert F Postel Asst Chf of Dept (Fire Dept)    212244.54   62490.6 13450.16    58778.57    288185.3    346963.87   2013        San Francisco   
72960   Jeffrey J Barden    Captain, Fire Suppression   155174.49   124293.83   18151.93    49001.55    297620.25   346621.8    2013        San Francisco   

The method above returns an incorrect answer of 1051. Could someone explain why my logic is wrong and suggest an alternative method?

Thanks!!!

dnaylor93

To answer my own question, my logic was wrong:

sal[sal['Year'] == 2013]['JobTitle'].nunique()

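counts the number of distinct job titles. If there are 10 people with the job title 'Engineer', that title is still counted only once, so the result is the total number of different titles in 2013, not the number of titles held by a single person.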
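To make that behaviour concrete, here is a minimal sketch on a small made-up frame. The name `toy` and its rows are assumptions for illustration only; they are not taken from the real 'sal' data.

```python
import pandas as pd

# Hypothetical toy data standing in for 'sal' (not the real salary file).
toy = pd.DataFrame({
    "Year": [2013, 2013, 2013, 2013, 2014],
    "JobTitle": ["Engineer", "Engineer", "Mayor", "Clerk", "Auditor"],
})

# Same pattern as in the question: filter to 2013, then call nunique().
titles_2013 = toy.loc[toy["Year"] == 2013, "JobTitle"]
distinct_titles = titles_2013.nunique()

# 'Engineer' appears twice but is counted once, alongside 'Mayor' and 'Clerk'.
print(distinct_titles)  # 3
```

Here nunique() reports 3 even though only two of those titles ('Mayor' and 'Clerk') have a single holder, which is the same kind of gap as 1051 versus 202.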

What I was actually looking for was the number of job titles held by exactly one person, which I found using:

sum(sal[sal['Year'] == 2013]['JobTitle'].value_counts() == 1)
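The same idea can be sketched on made-up data, using the pandas `.sum()` method on the boolean result instead of the built-in `sum`. The frame `toy` and its rows are hypothetical, and the `duplicated(keep=False)` variant is just one alternative way to get the same count, not something from the original solution.

```python
import pandas as pd

# Hypothetical toy data standing in for 'sal' (not the real salary file).
toy = pd.DataFrame({
    "Year": [2013, 2013, 2013, 2013, 2014],
    "JobTitle": ["Engineer", "Engineer", "Mayor", "Clerk", "Mayor"],
})

titles_2013 = toy.loc[toy["Year"] == 2013, "JobTitle"]

# value_counts() gives the number of rows per title; keep titles with count 1.
counts = titles_2013.value_counts()
single_holder_titles = int((counts == 1).sum())

# Alternative: duplicated(keep=False) flags every row whose title repeats,
# so the unflagged rows are exactly the titles held by one person.
single_holder_alt = int((~titles_2013.duplicated(keep=False)).sum())

print(single_holder_titles, single_holder_alt)  # 2 2
```

Filtering by year before counting matters: 'Mayor' appears again in 2014 in this toy frame, but it still counts as a single-holder title for 2013.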
