Halve a Data Set

partition(data)

Splits an unsorted data set into two unsorted data sets, each containing the same number of elements, with elements allocated solely according to their position in the original data set (in sets with an odd number of elements, the middle element is not included in either half)

Parameters

data (list of int or float) – List of numbers to analyze

Returns

sections['upper'] (list of int or float) – List of all elements from the upper half of a data set
sections['lower'] (list of int or float) – List of all elements from the lower half of a data set

See also

single_dimension(), unite_vectors()

Notes

Set of numbers: \(a_i = \{ a_1, a_2, \cdots, a_n \}\)
For sets with an odd number of elements:
- Lower section: \(a_{lower} = \{ a_1, a_2, \cdots, a_{\lfloor n/2 \rfloor} \}\)
- Upper section: \(a_{upper} = \{ a_{\lceil n/2 \rceil + 1}, a_{\lceil n/2 \rceil + 2}, \cdots, a_n \}\)
For sets with an even number of elements:
- Lower section: \(a_{lower} = \{ a_1, a_2, \cdots, a_{n/2} \}\)
- Upper section: \(a_{upper} = \{ a_{n/2 + 1}, a_{n/2 + 2}, \cdots, a_n \}\)
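
The positional split described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the library's implementation: the name partition_sketch is hypothetical, and the sketch drops the middle element of odd-length input, as the description of partition states.

```python
def partition_sketch(data):
    """Split a list by position into lower and upper halves.

    Hypothetical helper for illustration only; it is not the
    library's implementation of partition().
    """
    half_length = len(data) // 2              # floor(n / 2) elements per half
    lower = data[:half_length]                # a_1 ... a_floor(n/2)
    # Compute the start of the upper half explicitly; for odd n this
    # skips the middle element, for even n it starts right after the lower half
    upper = data[len(data) - half_length:]
    return {'upper': upper, 'lower': lower}


print(partition_sketch([5, 2, 9, 8])['upper'])                  # [9, 8]
print(partition_sketch([11, 3, 52, 25, 21, 25, 6])['lower'])    # [11, 3, 52]
```

Because both halves are taken with slices of length floor(n/2), the two sections always have the same size, and the original order of the elements is preserved within each section.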

Examples

Import partition function from regressions library

>>> from regressions.statistics.halve import partition

Determine the upper half of the set [5, 2, 9, 8]

>>> sections_short = partition([5, 2, 9, 8])
>>> print(sections_short['upper'])
[9, 8]

Determine the lower half of the set [11, 3, 52, 25, 21, 25, 6]

>>> sections_long = partition([11, 3, 52, 25, 21, 25, 6])
>>> print(sections_long['lower'])
[11, 3, 52]

half(data)

Splits an unsorted data set into two sorted data sets, each containing the same number of elements (in sets with an odd number of elements, the median is not included in either half)

Parameters

data (list of int or float) – List of numbers to analyze

Raises

TypeError – Argument must be a 1-dimensional list
TypeError – Elements of argument must be integers or floats

Returns

sections['upper'] (list of int or float) – List of all elements from the upper half of a sorted data set
sections['lower'] (list of int or float) – List of all elements from the lower half of a sorted data set

See also

sorted_list()

Notes

Set of numbers: \(a_i = \{ a_1, a_2, \cdots, a_n \}\)
Sorted version of set: \(A_i = ( A_1, A_2, \cdots, A_n )\)
- For all \(i\) with \(1 < i \leq n\): \(A_{i-1} \leq A_i\)
For sets with an odd number of elements:
- Lower section: \(A_{lower} = ( A_1, A_2, \cdots, A_{\lfloor n/2 \rfloor} )\)
- Upper section: \(A_{upper} = ( A_{\lceil n/2 \rceil + 1}, A_{\lceil n/2 \rceil + 2}, \cdots, A_n )\)
For sets with an even number of elements:
- Lower section: \(A_{lower} = ( A_1, A_2, \cdots, A_{n/2} )\)
- Upper section: \(A_{upper} = ( A_{n/2 + 1}, A_{n/2 + 2}, \cdots, A_n )\)
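
As a rough sketch of the sort-then-split behaviour described above, the following uses Python's built-in sorted(). The name half_sketch is hypothetical, the documented type checks are omitted, and nothing here is taken from the library's source; it only mirrors the documented results.

```python
def half_sketch(data):
    """Sort a list in ascending order, then split it into two halves.

    Hypothetical helper for illustration only; the TypeError checks
    documented for half() are intentionally left out.
    """
    ordered = sorted(data)                    # new ascending list; input is not modified
    half_length = len(ordered) // 2           # floor(n / 2) elements per half
    lower = ordered[:half_length]             # A_1 ... A_floor(n/2)
    # For odd n the median sits between the two slices and is dropped
    upper = ordered[len(ordered) - half_length:]
    return {'upper': upper, 'lower': lower}


print(half_sketch([5, 2, 9, 8])['upper'])                   # [8, 9]
print(half_sketch([11, 3, 52, 25, 21, 25, 6])['lower'])     # [3, 6, 11]
```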

Examples

Import half function from regressions library

>>> from regressions.statistics.halve import half

Determine the sorted upper half of the set [5, 2, 9, 8]

>>> sections_short = half([5, 2, 9, 8])
>>> print(sections_short['upper'])
[8, 9]

Determine the sorted lower half of the set [11, 3, 52, 25, 21, 25, 6]

>>> sections_long = half([11, 3, 52, 25, 21, 25, 6])
>>> print(sections_long['lower'])
[3, 6, 11]

half_dimension(data, dimension=1)

Splits an unsorted 2-dimensional data set into two sorted 2-dimensional data sets, each containing the same number of elements. Sorting is based on the element of each nested list indicated by the dimension parameter (in sets with an odd number of elements, the median is not included in either half)

Parameters

data (list of lists of int or float) – List of lists of numbers to analyze
dimension (int, default=1) – Position (counting from 1) of the element within each nested list by which to sort

Raises

TypeError – First argument must be a 2-dimensional list
TypeError – Elements nested within first argument must be integers or floats
ValueError – Last argument must be a positive integer

Returns

sections['upper'] (list of lists of int or float) – List of all elements from the upper half of a data set, sorted according to the elements occupying a provided position
sections['lower'] (list of lists of int or float) – List of all elements from the lower half of a data set, sorted according to the elements occupying a provided position

Notes

Set of ordered pairs of numbers: \(a_i = \{ ( a_{1,1}, a_{1,2}, \cdots, a_{1,j}, a_{1,n} ), ( a_{2,1}, a_{2,2}, \cdots, a_{2,j}, a_{2,n} ), \cdots, \\ ( a_{m,1}, a_{m,2}, \cdots, a_{m,j}, a_{m,n} ) \}\)
Sorted version of set according to the values in the \(j\)th position: \(A_i = ( ( A_{1,1}, A_{1,2}, \cdots, A_{1,j}, A_{1,n} ), ( A_{2,1}, A_{2,2}, \cdots, A_{2,j}, A_{2,n} ), \cdots, \\ ( A_{m,1}, A_{m,2}, \cdots, A_{m,j}, A_{m,n} ) )\)
- For all \(i\) with \(1 < i \leq m\): \(A_{i-1,j} \leq A_{i,j}\)
For sets with an odd number of ordered pairs:
- Lower section: \(A_{lower} = ( ( A_{1,1}, A_{1,2}, \cdots, A_{1,j}, A_{1,n} ), ( A_{2,1}, A_{2,2}, \cdots, A_{2,j}, A_{2,n} ), \cdots, \\ ( A_{\lfloor m/2 \rfloor,1}, A_{\lfloor m/2 \rfloor,2}, \cdots, A_{\lfloor m/2 \rfloor,j}, A_{\lfloor m/2 \rfloor,n} ) )\)
- Upper section: \(A_{upper} = ( ( A_{\lceil m/2 \rceil + 1,1}, A_{\lceil m/2 \rceil + 1,2}, \cdots, A_{\lceil m/2 \rceil + 1,j}, A_{\lceil m/2 \rceil + 1,n} ), ( A_{\lceil m/2 \rceil + 2,1}, A_{\lceil m/2 \rceil + 2,2}, \cdots, \\ A_{\lceil m/2 \rceil + 2,j}, A_{\lceil m/2 \rceil + 2,n} ), \cdots, ( A_{m,1}, A_{m,2}, \cdots, A_{m,j}, A_{m,n} ) )\)
For sets with an even number of ordered pairs:
- Lower section: \(A_{lower} = ( ( A_{1,1}, A_{1,2}, \cdots, A_{1,j}, A_{1,n} ), ( A_{2,1}, A_{2,2}, \cdots, A_{2,j}, A_{2,n} ), \cdots, \\ ( A_{m/2,1}, A_{m/2,2}, \cdots, A_{m/2,j}, A_{m/2,n} ) )\)
- Upper section: \(A_{upper} = ( ( A_{m/2 + 1,1}, A_{m/2 + 1,2}, \cdots, A_{m/2 + 1,j}, A_{m/2 + 1,n} ), ( A_{m/2 + 2,1}, A_{m/2 + 2,2}, \cdots, \\ A_{m/2 + 2,j}, A_{m/2 + 2,n} ), \cdots, ( A_{m,1}, A_{m,2}, \cdots, A_{m,j}, A_{m,n} ) )\)
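
The notes above can be approximated with sorted() and a key function. This is a minimal sketch, not the library's code: half_dimension_sketch is a hypothetical name, the documented validation is omitted, and the 1-based reading of dimension is inferred from the examples in this section. How the library orders rows with equal sort values is not stated; this sketch keeps them in their original relative order because sorted() is stable.

```python
def half_dimension_sketch(data, dimension=1):
    """Sort nested lists by one position, then split into two halves.

    Hypothetical helper for illustration only; the TypeError and
    ValueError checks documented for half_dimension() are omitted.
    """
    position = dimension - 1                  # assumed 1-based, converted to a 0-based index
    # Order whole rows by the value found at the chosen position
    ordered = sorted(data, key=lambda row: row[position])
    half_length = len(ordered) // 2           # floor(m / 2) rows per half
    return {
        # For odd m the middle row falls between the two slices and is dropped
        'upper': ordered[len(ordered) - half_length:],
        'lower': ordered[:half_length],
    }


rows = [[3, 7, 1], [1, 8, 11], [6, 6, 6], [2, 15, 3], [10, 5, 9]]
print(half_dimension_sketch(rows, 2)['upper'])    # [[1, 8, 11], [2, 15, 3]]
print(half_dimension_sketch(rows, 3)['lower'])    # [[3, 7, 1], [2, 15, 3]]
```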

Examples

Import half_dimension function from regressions library

>>> from regressions.statistics.halve import half_dimension

Determine the upper half of the set [[3, 7, 1], [1, 8, 11], [6, 6, 6], [2, 15, 3], [10, 5, 9]] based on the second dimension

>>> sections_2d = half_dimension([[3, 7, 1], [1, 8, 11], [6, 6, 6], [2, 15, 3], [10, 5, 9]], 2)
>>> print(sections_2d['upper'])
[[1, 8, 11], [2, 15, 3]]

Determine the lower half of the set [[3, 7, 1], [1, 8, 11], [6, 6, 6], [2, 15, 3], [10, 5, 9]] based on the third dimension

>>> sections_3d = half_dimension([[3, 7, 1], [1, 8, 11], [6, 6, 6], [2, 15, 3], [10, 5, 9]], 3)
>>> print(sections_3d['lower'])
[[3, 7, 1], [2, 15, 3]]