Calculate mean, variance, covariance of different length matrices in a split list

chron0x

I have an array whose rows each hold 5 values: 4 data values and one index. I sort the array by the index and split it wherever the index changes, which gives me a list of matrices with different numbers of rows. For every split I want to calculate the mean and variance of the fourth value and the covariance of the first 3 values. My current approach uses a for loop, which I would like to replace with matrix operations, but I am struggling with the different sizes of my matrices.

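import numpy as np

# 10 rows: columns 0-3 hold values, column 4 holds a group index in {0, 1, 2, 3}
A = np.random.rand(10, 5)
A[:, -1] = np.random.randint(4, size=10)
# Sort rows by the index column, then split wherever the index changes
sorted_A = A[np.argsort(A[:, 4])]
splits = np.split(sorted_A, np.where(np.diff(sorted_A[:, 4]))[0] + 1)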

My current for loop looks like this:

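# One row per split: [mean(col 3), var(col 3), var(col 0), var(col 1), var(col 2)]
result = np.zeros((len(splits), 5))
for idx, values in enumerate(splits):
    if len(values) > 0:
        result[idx, 0] = np.mean(values[:, 3])
        result[idx, 1] = np.var(values[:, 3])
        # Only the diagonal of the covariance matrix (the per-column variances) is kept
        result[idx, 2:5] = np.cov(values[:, 0:3].transpose(), ddof=0).diagonal()
    # An empty split has nothing to average, so its row stays at zero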

I tried to work with masked arrays without success, since I couldn't load the matrices into the masked arrays in a proper form. Maybe someone knows how to do this or has a different suggestion.

Paul Panzer

You can use np.add.reduceat as follows:

>>> idx = np.concatenate([[0], np.where(np.diff(sorted_A[:,4]))[0]+1, [A.shape[0]]])
>>> result2 = np.empty((idx.size-1, 5))
>>> result2[:, 0] = np.add.reduceat(sorted_A[:, 3], idx[:-1]) / np.diff(idx)
>>> result2[:, 1] = np.add.reduceat(sorted_A[:, 3]**2, idx[:-1]) / np.diff(idx) - result2[:, 0]**2
>>> result2[:, 2:5] = np.add.reduceat(sorted_A[:, :3]**2, idx[:-1], axis=0) / np.diff(idx)[:, None]
>>> result2[:, 2:5] -= (np.add.reduceat(sorted_A[:, :3], idx[:-1], axis=0) / np.diff(idx)[:, None])**2
>>> 
>>> np.allclose(result, result2)
True
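
In case np.add.reduceat is unfamiliar, here is a minimal sketch of what it does on a toy array. The names values and starts are made up for illustration and are not part of the code above; starts plays the role of idx[:-1].

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
starts = np.array([0, 2, 5])  # hypothetical segment start offsets

# Each output element is the sum of values[starts[i]:starts[i+1]];
# the last segment runs to the end of the array.
segment_sums = np.add.reduceat(values, starts)

# Segment lengths: gaps between consecutive starts, with the array length appended.
segment_lengths = np.diff(np.append(starts, values.size))

# Dividing sums by lengths gives the per-segment means.
segment_means = segment_sums / segment_lengths

print(segment_sums)   # sums of [1, 2], [3, 4, 5] and [6]: 3, 12, 6
print(segment_means)  # 1.5, 4, 6
```

The same pattern applied to the squared values gives the per-segment mean of squares, which is what the variance lines above rely on.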

Note that the diagonal entries of the covariance matrix are just the variances of the individual columns, which simplifies this vectorization quite a bit.
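
That the covariance diagonal matches the column variances can be checked on a single block of data. This is only a sketch: X is a random stand-in for the first three columns of one split, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
X = rng.random((7, 3))          # stand-in for one split's first three columns

# Diagonal of the population covariance matrix (ddof=0), as in the question's loop.
cov_diag = np.cov(X.T, ddof=0).diagonal()

# Per-column population variance.
col_var = np.var(X, axis=0)

# The "mean of squares minus squared mean" form used in the reduceat version.
mean_sq_form = (X**2).mean(axis=0) - X.mean(axis=0)**2

diag_matches_var = np.allclose(cov_diag, col_var)
var_matches_mean_sq = np.allclose(col_var, mean_sq_form)
print(diag_matches_var, var_matches_mean_sq)
```

One caveat: the mean-of-squares form can lose precision when the mean is large compared to the spread of the data. For values in [0, 1) as in the question this should not matter.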
