papaya
Class Rank
java.lang.Object
papaya.Rank
public class Rank
- extends Object
Ranking based on the natural ordering on floats for a sequence of data that may also
contain NaNs.
When present, NaNs are treated according to the configured NaNStrategy constants and ties
are handled using the configured tiesStrategy constants as follows:
Strategies for handling NaN values in rank transformations.
- 0 (REMOVED, default) - NaNs are removed before the rank transform is applied
- 1 (MINIMAL) - NaNs are treated as minimal in the ordering, equivalent to (that is, tied with) Float.NEGATIVE_INFINITY.
- 2 (MAXIMAL) - NaNs are treated as maximal in the ordering, equivalent to Float.POSITIVE_INFINITY.
- 3 (FIXED) - NaNs are left "in place," that is the rank transformation is
applied to the other elements in the input array, but the NaN elements
are returned unchanged.
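The first three NaN strategies can be read as a preprocessing step that yields a NaN-free array before ranking. The following is a minimal sketch of that reading, not the library's own code: the class name NaNSketch and the method prepare are hypothetical, and FIXED (3) is left out because it needs the ranks themselves.

```java
import java.util.Arrays;

public class NaNSketch {
    // Integer codes follow the documented strategy numbers; the names are illustrative.
    static final int REMOVED = 0, MINIMAL = 1, MAXIMAL = 2;

    /** Returns a NaN-free copy of data, ready for an ordinary rank transform. */
    static float[] prepare(float[] data, int nanStrategy) {
        if (nanStrategy == REMOVED) {
            // Copy only the non-NaN values, preserving their order.
            float[] kept = new float[data.length];
            int n = 0;
            for (float v : data) {
                if (!Float.isNaN(v)) kept[n++] = v;
            }
            return Arrays.copyOf(kept, n);
        }
        if (nanStrategy != MINIMAL && nanStrategy != MAXIMAL) {
            throw new IllegalArgumentException("strategy not covered by this sketch");
        }
        // MINIMAL and MAXIMAL substitute an infinity, so NaNs tie with that value.
        float substitute = (nanStrategy == MINIMAL)
                ? Float.NEGATIVE_INFINITY
                : Float.POSITIVE_INFINITY;
        float[] out = data.clone();
        for (int i = 0; i < out.length; i++) {
            if (Float.isNaN(out[i])) out[i] = substitute;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] data = {20f, Float.NaN, Float.NEGATIVE_INFINITY};
        for (int s = REMOVED; s <= MAXIMAL; s++) {
            System.out.println(s + " -> " + Arrays.toString(prepare(data, s)));
        }
    }
}
```

Under REMOVED the result is shorter than the input, which is why the rank arrays for strategy 0 have one element fewer than the data whenever a NaN is present.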
Strategies for handling tied values in rank transformations:
- 0 (AVERAGE, default) - Tied values are assigned the average of the applicable ranks.
For example, (1,3,4,3) is ranked as (1,2.5,4,2.5)
- 1 (MINIMUM) - Tied values are assigned the minimum applicable rank, or the rank
of the first occurrence. For example, (1,3,4,3) is ranked as (1,2,4,2)
- 2 (MAXIMUM) - Tied values are assigned the maximum applicable rank, or the rank
of the last occurrence. For example, (1,3,4,3) is ranked as (1,3,4,3)
- 3 (SEQUENTIAL) - Ties are assigned ranks in order of occurrence in the original array,
for example (1,3,4,3) is ranked as (1,2,4,3)
The defaults are 0 (REMOVED) and 0 (AVERAGE) for the NaNStrategy and TiesStrategy respectively.
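To make the four ties strategies concrete, here is a small self-contained sketch that ranks NaN-free data and resolves each group of tied values according to the strategy code. It is an illustration written for this page, not the library's implementation; the class TiesSketch and its constant names are hypothetical.

```java
import java.util.Arrays;

public class TiesSketch {
    // Integer codes follow the documented strategy numbers; the names are illustrative.
    static final int AVERAGE = 0, MINIMUM = 1, MAXIMUM = 2, SEQUENTIAL = 3;

    /** Ranks NaN-free data (1-based), resolving ties by the given strategy. */
    static float[] rank(float[] data, int tiesStrategy) {
        int n = data.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sorting object arrays is stable, so tied values keep their order of occurrence.
        Arrays.sort(order, (a, b) -> Float.compare(data[a], data[b]));

        float[] ranks = new float[n];
        int start = 0;
        while (start < n) {
            // Sorted positions start..end (0-based) hold one group of tied values.
            int end = start;
            while (end + 1 < n && data[order[end + 1]] == data[order[start]]) end++;
            for (int k = start; k <= end; k++) {
                float r;
                switch (tiesStrategy) {
                    case AVERAGE:    r = (start + end) / 2.0f + 1; break;
                    case MINIMUM:    r = start + 1; break;
                    case MAXIMUM:    r = end + 1; break;
                    case SEQUENTIAL: r = k + 1; break;
                    default: throw new IllegalArgumentException("unknown ties strategy");
                }
                ranks[order[k]] = r;
            }
            start = end + 1;
        }
        return ranks;
    }

    public static void main(String[] args) {
        float[] data = {1f, 3f, 4f, 3f};
        for (int s = AVERAGE; s <= SEQUENTIAL; s++) {
            System.out.println(s + " -> " + Arrays.toString(rank(data, s)));
        }
    }
}
```

For the input (1,3,4,3), the sketch reproduces the four rankings listed above: (1,2.5,4,2.5), (1,2,4,2), (1,3,4,3) and (1,2,4,3).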
Examples:
Input data: (20, 17, 30, 42.3, 17, 50, Float.NaN, Float.NEGATIVE_INFINITY, 17)
| NaNStrategy | TiesStrategy | rank(data) |
|---|---|---|
| 0 (default = NaNs removed) | 0 (default = ties averaged) | (5, 3, 6, 7, 3, 8, 1, 3) |
| 0 (default = NaNs removed) | 1 (MINIMUM) | (5, 2, 6, 7, 2, 8, 1, 2) |
| 1 (MINIMAL) | 0 (default = ties averaged) | (6, 4, 7, 8, 4, 9, 1.5, 1.5, 4) |
| 1 (MINIMAL) | 2 (MAXIMUM) | (6, 5, 7, 8, 5, 9, 2, 2, 5) |
| 2 (MAXIMAL) | 2 (MAXIMUM) | (5, 4, 6, 7, 4, 8, 9, 1, 4) |
(Code adapted from the org.apache.commons.math.stat.ranking package, and modified extensively).
Method Summary
| Modifier and type | Method | Description |
|---|---|---|
| static float[] | rank(float[] data, int tiesStrategy) | Rank an array (with no NaNs) using the natural ordering on Floats with ties resolved using tiesStrategy. |
| static float[] | rank(float[] data, int tiesStrategy, int nanStrategy) | Rank an array containing NaN values using the natural ordering on Floats, with NaN values handled according to nanStrategy and ties resolved using tiesStrategy. |
| static float[] | rank(int[] data, int tiesStrategy) | Rank an array (with no NaNs) using the natural ordering on Floats with ties resolved using tiesStrategy. |
| static float[] | rank(int[] data, int tiesStrategy, int nanStrategy) | Rank an array containing NaN values using the natural ordering on Floats, with NaN values handled according to nanStrategy and ties resolved using tiesStrategy. |
rank
public static float[] rank(int[] data,
int tiesStrategy,
int nanStrategy)
- Rank an array containing NaN values using the natural ordering on Floats, with NaN values handled according to nanStrategy and ties resolved using tiesStrategy.
Input values that specify which strategy to use for handling tied values in the
rank transformations:
- 0 (AVERAGE, default) - Tied values are assigned the average of the applicable ranks.
For example, (1,3,4,3) is ranked as (1,2.5,4,2.5)
- 1 (MINIMUM) - Tied values are assigned the minimum applicable rank, or the rank
of the first occurrence. For example, (1,3,4,3) is ranked as (1,2,4,2)
- 2 (MAXIMUM) - Tied values are assigned the maximum applicable rank, or the rank
of the last occurrence. For example, (1,3,4,3) is ranked as (1,3,4,3)
- 3 (SEQUENTIAL) - Ties are assigned ranks in order of occurrence in the original array,
for example (1,3,4,3) is ranked as (1,2,4,3)
Input values that specify which strategy to use for handling NaN values in the
rank transformations:
- 0 (REMOVED, default) - NaNs are removed before the rank transform is applied
- 1 (MINIMAL) - NaNs are treated as minimal in the ordering, equivalent to (that is, tied with) Float.NEGATIVE_INFINITY.
- 2 (MAXIMAL) - NaNs are treated as maximal in the ordering, equivalent to Float.POSITIVE_INFINITY.
- 3 (FIXED) - NaNs are left "in place," that is the rank transformation is
applied to the other elements in the input array, but the NaN elements
are returned unchanged.
If the data array has no NaN values, use rank(float[], int)
instead. It is quicker.
- Parameters:
data - array to be ranked. This is cast to a float array prior to ranking.
nanStrategy - 0, 1, 2 or 3, corresponding to the NaN strategy to employ.
tiesStrategy - 0, 1, 2 or 3, corresponding to the ties strategy to employ.
- Returns:
- array of ranks
rank
public static float[] rank(float[] data,
int tiesStrategy,
int nanStrategy)
- Rank an array containing NaN values using the natural ordering on Floats, with NaN values handled according to nanStrategy and ties resolved using tiesStrategy.
Input values that specify which strategy to use for handling tied values in the
rank transformations:
- 0 (AVERAGE, default) - Tied values are assigned the average of the applicable ranks.
For example, (1,3,4,3) is ranked as (1,2.5,4,2.5)
- 1 (MINIMUM) - Tied values are assigned the minimum applicable rank, or the rank
of the first occurrence. For example, (1,3,4,3) is ranked as (1,2,4,2)
- 2 (MAXIMUM) - Tied values are assigned the maximum applicable rank, or the rank
of the last occurrence. For example, (1,3,4,3) is ranked as (1,3,4,3)
- 3 (SEQUENTIAL) - Ties are assigned ranks in order of occurrence in the original array,
for example (1,3,4,3) is ranked as (1,2,4,3)
Input values that specify which strategy to use for handling NaN values in the
rank transformations:
- 0 (REMOVED, default) - NaNs are removed before the rank transform is applied
- 1 (MINIMAL) - NaNs are treated as minimal in the ordering, equivalent to (that is, tied with) Float.NEGATIVE_INFINITY.
- 2 (MAXIMAL) - NaNs are treated as maximal in the ordering, equivalent to Float.POSITIVE_INFINITY.
- 3 (FIXED) - NaNs are left "in place," that is the rank transformation is
applied to the other elements in the input array, but the NaN elements
are returned unchanged.
If the data array has no NaN values, use rank(float[], int)
instead. It is quicker.
- Parameters:
data - array to be ranked.
nanStrategy - 0, 1, 2 or 3, corresponding to the NaN strategy to employ.
tiesStrategy - 0, 1, 2 or 3, corresponding to the ties strategy to employ.
- Returns:
- array of ranks
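The FIXED (3) NaN strategy differs from the others in that NaNs neither leave the array nor take part in the ordering. A minimal sketch of one reading of that description follows, with ties averaged; the class FixedNaNSketch and method rankFixed are hypothetical names, and the quadratic counting loop is chosen for clarity rather than speed. It is not the library's code.

```java
import java.util.Arrays;

public class FixedNaNSketch {
    /** Ranks non-NaN entries with averaged ties; NaN entries stay NaN in place. */
    static float[] rankFixed(float[] data) {
        int n = data.length;
        float[] ranks = new float[n];
        for (int i = 0; i < n; i++) {
            if (Float.isNaN(data[i])) {
                ranks[i] = Float.NaN; // returned unchanged
                continue;
            }
            int smaller = 0, equal = 0;
            for (int j = 0; j < n; j++) {
                if (Float.isNaN(data[j])) continue; // NaNs do not take part in the ordering
                if (data[j] < data[i]) smaller++;
                else if (data[j] == data[i]) equal++;
            }
            // Tied values occupy ranks smaller+1 .. smaller+equal; assign their mean.
            ranks[i] = smaller + (equal + 1) / 2.0f;
        }
        return ranks;
    }

    public static void main(String[] args) {
        float[] data = {20f, 17f, 30f, 42.3f, 17f, 50f,
                        Float.NaN, Float.NEGATIVE_INFINITY, 17f};
        System.out.println(Arrays.toString(rankFixed(data)));
    }
}
```

Under this reading, the input used in the class-level examples gives the same ranks as the REMOVED strategy with averaged ties, except that the NaN keeps its position: (5, 3, 6, 7, 3, 8, NaN, 1, 3).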
rank
public static float[] rank(int[] data,
int tiesStrategy)
- Rank an array (with no NaNs) using the natural ordering on Floats with ties resolved using tiesStrategy.
Input values that specify which strategy to use for handling tied values in the
rank transformations:
- 0 (AVERAGE, default) - Tied values are assigned the average of the applicable ranks.
For example, (1,3,4,3) is ranked as (1,2.5,4,2.5)
- 1 (MINIMUM) - Tied values are assigned the minimum applicable rank, or the rank
of the first occurrence. For example, (1,3,4,3) is ranked as (1,2,4,2)
- 2 (MAXIMUM) - Tied values are assigned the maximum applicable rank, or the rank
of the last occurrence. For example, (1,3,4,3) is ranked as (1,3,4,3)
- 3 (SEQUENTIAL) - Ties are assigned ranks in order of occurrence in the original array,
for example (1,3,4,3) is ranked as (1,2,4,3)
- Parameters:
data - array to be ranked. The array is cast to a float array prior to ranking.
tiesStrategy - the strategy to employ for ties.
- Returns:
- array of ranks
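Because the int[] overloads convert the data to floats before ranking, very large integers can lose precision. The sketch below assumes the conversion is a plain element-wise int-to-float widening (the helper toFloat is hypothetical, not part of the library) and shows that two distinct ints above 2^24 can map to the same float, which would then be ranked as a tie.

```java
public class IntToFloatSketch {
    /** Element-wise int -> float conversion, assumed to mirror the cast described above. */
    static float[] toFloat(int[] data) {
        float[] out = new float[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i]; // widening conversion; rounds when |value| exceeds 2^24
        }
        return out;
    }

    public static void main(String[] args) {
        float[] f = toFloat(new int[]{16777216, 16777217});
        // A float has a 24-bit significand, so both values round to the same float.
        System.out.println(f[0] == f[1]);
    }
}
```

For data in the usual range of counts or small measurements this does not matter; it only becomes relevant for magnitudes beyond 16,777,216.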
rank
public static float[] rank(float[] data,
int tiesStrategy)
- Rank an array (with no NaNs) using the natural ordering on Floats with ties resolved using tiesStrategy.
Input values that specify which strategy to use for handling tied values in the
rank transformations:
- 0 (AVERAGE, default) - Tied values are assigned the average of the applicable ranks.
For example, (1,3,4,3) is ranked as (1,2.5,4,2.5)
- 1 (MINIMUM) - Tied values are assigned the minimum applicable rank, or the rank
of the first occurrence. For example, (1,3,4,3) is ranked as (1,2,4,2)
- 2 (MAXIMUM) - Tied values are assigned the maximum applicable rank, or the rank
of the last occurrence. For example, (1,3,4,3) is ranked as (1,3,4,3)
- 3 (SEQUENTIAL) - Ties are assigned ranks in order of occurrence in the original array,
for example (1,3,4,3) is ranked as (1,2,4,3)
- Parameters:
data - array to be ranked.
tiesStrategy - the strategy to employ for ties.
- Returns:
- array of ranks
Processing library papaya by
Adila Faruk. (C) 2014