I want to take the sum of the values of previous dates except current date by Product. Given below is the dataset. In the output table, I have mentioned how my summation of values of previous dates should look like.
For Eg: For the date 7/27/2014 I want the summation of the previous dates except current date by product level. Similarly when you are taking the summation for 7/20/2014 ignore the values for 7/20/2014 and also 7/27/2014 and take rest of the previous dates. Also there is a exception here. When you see duplicates for example 7/20/2014 and 6/8/2014 consider just one value.
DATE DEALID PRODUCT VALUE
7/27/2014 6575 CLIENT 4
7/20/2014 16701 CLIENT 6
7/20/2014 16701 CLIENT 6
7/13/2014 6601 CLIENT 4
7/6/2014 10871 SERVICES 5
6/29/2014 16661 SERVICES 2
6/22/2014 66757 SERVICES 1
6/15/2014 77757 SERVICES 5
6/8/2014 5675 SERVICES 8
6/8/2014 5675 SERVICES 8
5/25/2014 5756 SERVICES 4
output table
DATE DEALID PRODUCT VALUE SUMM
7/27/2014 6575 CLIENT 4 10
7/20/2014 16701 CLIENT 6 4
7/20/2014 16701 CLIENT 6 4
7/13/2014 6601 CLIENT 4 0
7/6/2014 10871 SERVICES 5 20
6/29/2014 16661 SERVICES 2 18
6/22/2014 66757 SERVICES 1 17
6/15/2014 77757 SERVICES 5 12
6/8/2014 5675 SERVICES 8 4
6/8/2014 5675 SERVICES 8 4
5/25/2014 5756 SERVICES 4 0
Plain old SAS data steps are usualy more practical and often run faster for very specific requirements.
Let us start like NEO_mental to create the data
data test;
infile datalines;
format date mmddyy10.; /** Make SAS print dates as a date instead of numbers **/
input date : mmddyy10.
Dealid : $ 8.
PRODUCT : $10.
Value : 8.
;
datalines;
7/27/2014 6575 CLIENT 4
7/20/2014 16701 CLIENT 6
7/20/2014 16701 CLIENT 6
7/13/2014 6601 CLIENT 4
7/6/2014 10871 SERVICES 5
6/29/2014 16661 SERVICES 2
6/22/2014 66757 SERVICES 1
6/15/2014 77757 SERVICES 5
6/8/2014 5675 SERVICES 8
6/8/2014 5675 SERVICES 8
5/25/2014 5756 SERVICES 4
;
run;
To calculate a running total, I sort in ascending date order. I do not remove duplicates, as I handle them in my data step.
proc sort data=test out=ascendingTest;
by Product Date;
run;
Here comes the good old data step, in which I do all the calculations
/** Create a dataset including the running total **/
Data summTest;
/** Read in the data **/
set ascendingTest;
/** Enable things like first.Product and last.Date **/
by Product Date;
/** Create the running total **/
/** variables are initialised for each observation (=row) unless you retain them **/
retain Summ;
if first.Product then Summ = 0; /** Start over for each product **/
/** Write out the result BEFORE increasing the total **/
output;
/** Increase the running total for the later dates **/
if last.Date then Summ = Summ + Value;
run;
As we did not remove duplicates, I do not need to merge, so all I have to do is to sort on descending date agian. Note: if performance is an issue, write 'Data summTest / view=summTest. This way the data step will not read any data and the calculations will only be done when the sort step consumes the results. ;
proc sort data=summTest out=final;
by Product descending Date ;
run;
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments