Calculate mean of HashMap values after insertion

Juergen

Each time a new key/value pair is inserted into a HashMap, I want to efficiently calculate two means over its values.

Suppose we currently have this HashMap<Double, Double>:

3 4
5 6
8 8
1 3
6 8 <- Latest insertion

The latest insertion was the key 6 with value 8.

The first mean is taken over all values whose keys are smaller than the inserted key, 6.

These are the values 4, 6, 3 belonging to the keys 3, 5, 1, so the mean is (4 + 6 + 3) / 3 = 4.33...

The second mean is the "opposite": the mean of all values whose keys are greater than 6.

Only the key 8 (with value 8) is greater, so this mean is 8 / 1 = 8.

Now, a new key/value pair gets inserted:

3 4
5 6
6 8
8 8
1 3
4 9 <- Latest insertion

So again, we need the mean of all values whose keys are smaller than 4.

These are the values 4, 3 for the keys 3, 1, so the "smaller mean" is now (4 + 3) / 2 = 3.5.

The "greater mean" is now (6 + 8 + 8) / 3 = 7.33... for the key/value pairs 5/6, 6/8, 8/8.

A naive implementation might be something like this:

import java.util.HashMap;
import java.util.Map;

public class CalculateMapMean {

    private double smallerMean = 0.0;
    private double greaterMean = 0.0;

    private HashMap<Double, Double> someMap = new HashMap<Double, Double>();

    // Recomputes both means from scratch after an insertion by scanning
    // every entry in the map, which costs O(n) per call.
    public void calculateMeans(double latestInsertedKey) {
        double sumGreater = 0;
        double sumSmaller = 0;
        int sumGreaterCount = 0;
        int sumSmallerCount = 0;
        for (Map.Entry<Double, Double> entry : someMap.entrySet()) {
            double key = entry.getKey();
            double value = entry.getValue();
            if (key > latestInsertedKey) {
                sumGreater += value;
                ++sumGreaterCount;
            } else if (key < latestInsertedKey) {
                sumSmaller += value;
                ++sumSmallerCount;
            }
        }
        // Guard against division by zero when no smaller/greater keys exist.
        if (sumGreaterCount != 0) {
            greaterMean = sumGreater / sumGreaterCount;
        } else {
            greaterMean = 0.0;
        }
        if (sumSmallerCount != 0) {
            smallerMean = sumSmaller / sumSmallerCount;
        } else {
            smallerMean = 0.0;
        }
    }
}

The question is whether the calculation of the means can be dramatically improved with a TreeMap or another data structure, such that one does not have to iterate over all keys on every insertion.

Is there an elegant way of reusing former calculations?
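For reference, here is roughly what the same naive recomputation would look like with a java.util.TreeMap (the class and method names beyond the standard TreeMap API are just illustrative). The headMap/tailMap views hand you the two key ranges directly, but averaging their values still visits every entry, so by itself this is still O(n) per insertion:

import java.util.TreeMap;

public class TreeMapMeans {

    private final TreeMap<Double, Double> map = new TreeMap<Double, Double>();

    // Inserts a pair and recomputes both means; obtaining the range views is
    // cheap, but streaming over their values is still linear in the map size.
    public double[] insertAndComputeMeans(double key, double value) {
        map.put(key, value);
        double smallerMean = map.headMap(key)          // keys strictly smaller than key
                .values().stream()
                .mapToDouble(Double::doubleValue)
                .average().orElse(0.0);
        double greaterMean = map.tailMap(key, false)   // keys strictly greater than key
                .values().stream()
                .mapToDouble(Double::doubleValue)
                .average().orElse(0.0);
        return new double[] { smallerMean, greaterMean };
    }
}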

Vincent van der Weele

The only way I can think of to get below O(n) time per change to the map is to keep a balanced binary search tree (BBST) over the keys. In every node you keep a few extra fields:

  • the number of nodes in the subtree rooted at that node
  • the sum of the values of all nodes in the subtree rooted at that node

Rebalancing a BBST after an insert/delete takes O(log n) time. In that same rebalancing operation you can update the count and sum fields, also in O(log n) time overall (you perform O(log n) updates that each take O(1) time).
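A sketch of what such an augmented node could look like (the names are made up for illustration; the actual insertion and rebalancing code of whichever BBST flavour you pick, e.g. AVL or red-black, is omitted):

class Node {
    double key;
    double value;
    Node left, right;

    int subtreeCount;     // number of nodes in the subtree rooted here
    double subtreeSum;    // sum of the values in the subtree rooted here

    Node(double key, double value) {
        this.key = key;
        this.value = value;
        update();
    }

    // Recompute the aggregates from the children. Call this, bottom-up, on
    // every node whose children changed during an insert or a rotation; each
    // call is O(1), so maintaining the fields stays within O(log n) per insert.
    void update() {
        subtreeCount = 1 + count(left) + count(right);
        subtreeSum = value + sum(left) + sum(right);
    }

    static int count(Node n)  { return n == null ? 0 : n.subtreeCount; }
    static double sum(Node n) { return n == null ? 0.0 : n.subtreeSum; }
}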

To get the correct means you traverse the tree and add up the right counts and sums. Let's look at a simple example. Suppose I have the following 7 key/value pairs; I hope you can imagine how the corresponding BBST would look.

(3, 5) (4, 3) (7, 1) (8, 4) (11, 3) (12, 1) (13, 3)

In the root - (8, 4) - the total count and sum are stored: [7, 20]. In the root of the left subtree - (4, 3) - the count and sum of that subtree are stored: [3, 9]. Here are these extra values drawn per level of the tree:

[         7, 20        ]
[   3, 9   ][   3, 7   ]
[1, 5][1, 1][1, 3][1, 3]

Suppose I now add a new pair with key 10. I start traversing the tree at the root. Because 8 < 10, I don't need to traverse the left subtree: all keys in that subtree are smaller than 10, so its cached values [3, 9] (plus the root's own value) go to the "smaller" side. For the right subtree I need to recurse, because some of its keys might be smaller than 10 and some might be larger. At its root, (12, 1), I don't have to traverse the right subtree, because 12 > 10, so its cached [1, 3] (plus that node's own value) goes to the "greater" side, and I recurse into the left child, (11, 3), which also goes to the "greater" side. In total this gives [4, 13] for keys smaller than 10 and [3, 7] for keys greater than 10.

In every layer of the tree, we can ignore one branch and recurse on the other. Therefore, finding the total value and count for keys smaller than the last inserted key and for keys larger than the last inserted key takes O(log n) time as well.
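Put together, the two queries could look roughly like this (again just a sketch against the Node class above; root is the root of the augmented tree, and the freshly inserted key is assumed to already be in the tree):

// Mean of all values whose key is strictly smaller than k: walk down from the
// root; whenever the current key is < k, the node and its whole left subtree
// qualify, so their cached aggregates are taken in one O(1) step.
static double smallerMean(Node root, double k) {
    int count = 0;
    double sum = 0.0;
    for (Node n = root; n != null; ) {
        if (n.key < k) {
            count += 1 + Node.count(n.left);
            sum += n.value + Node.sum(n.left);
            n = n.right;
        } else {
            n = n.left;
        }
    }
    return count == 0 ? 0.0 : sum / count;
}

// The "greater" mean is symmetric: take the node and its right subtree
// whenever the current key is > k, otherwise descend to the right.
static double greaterMean(Node root, double k) {
    int count = 0;
    double sum = 0.0;
    for (Node n = root; n != null; ) {
        if (n.key > k) {
            count += 1 + Node.count(n.right);
            sum += n.value + Node.sum(n.right);
            n = n.left;
        } else {
            n = n.right;
        }
    }
    return count == 0 ? 0.0 : sum / count;
}

Note that java.util.TreeMap keeps its internal nodes private, so it cannot be augmented with these fields directly; you would either implement the tree yourself or use a structure that exposes such subtree aggregates.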
