baybe.utils.augmentation.df_apply_dependency_augmentation

baybe.utils.augmentation.df_apply_dependency_augmentation(df: DataFrame, causing: tuple[str, Sequence], affected: Collection[tuple[str, Sequence]])

Augment a dataframe if dependency-invariant columns are present.

This works with column-values pairs for the causing column and the affected columns. Any row in which the specified causing column takes one of the provided values triggers an augmentation of the affected columns. For each such row, new rows are added that cover all invariant values of the affected columns (all combinations of them if several affected columns are given), while the remaining columns are left unchanged.
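The mechanism described above can be sketched in plain pandas. This is a minimal illustration, not the library's implementation: the function name `sketch_dependency_augmentation` is hypothetical, and the row ordering as well as the handling of original values that are missing from the invariant values are assumptions chosen to be consistent with the examples on this page.

```python
from itertools import product

import pandas as pd


def sketch_dependency_augmentation(df, causing, affected):
    """Hypothetical sketch of the augmentation idea (not the BayBE code)."""
    causing_col, causing_vals = causing
    affected_cols = [col for col, _ in affected]
    # All combinations of invariant values across the affected columns.
    combos = list(product(*(vals for _, vals in affected)))

    rows, index = [], []
    for idx, record in zip(df.index, df.to_dict("records")):
        variants = [record]
        if record[causing_col] in causing_vals:
            # Overwrite the affected columns with every value combination.
            expanded = [{**record, **dict(zip(affected_cols, c))} for c in combos]
            # Assumption: keep the original row even if its own values are
            # not among the invariant values.
            variants = expanded if record in expanded else [record] + expanded
        rows.extend(variants)
        # Every variant keeps the index label of its source row.
        index.extend([idx] * len(variants))
    return pd.DataFrame(rows, index=index, columns=df.columns)


df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [5, 5], "D": [6, 7]})
dfa = sketch_dependency_augmentation(df, ("A", [0]), [("B", [2, 3]), ("C", [5, 6])])
print(dfa)  # index labels: 0, 0, 0, 0, 1
```

Only the row with `A == 0` is expanded here, into the four combinations of `B` and `C`; the row with `A == 1` passes through unchanged.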

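Parameters:

df (DataFrame) – The dataframe to augment.
causing (tuple[str, Sequence]) – The name of the causing column and the values that trigger the augmentation.
affected (Collection[tuple[str, Sequence]]) – The affected columns, each given as a pair of column name and invariant values.

Return type: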

DataFrame

Returns:

The augmented dataframe, which contains the rows of the original one. Each augmented row carries the same index label as the original row it was derived from.

Examples

>>> import pandas as pd
>>> from baybe.utils.augmentation import df_apply_dependency_augmentation
>>> df = pd.DataFrame({'A':[0,1],'B':[2,3], 'C': [5, 5], 'D': [6, 7]})
>>> df
   A  B  C  D
0  0  2  5  6
1  1  3  5  7
>>> causing = ('A', [0])
>>> affected = [('B', [2,3,4])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  3  5  6
0  0  4  5  6
1  1  3  5  7
>>> causing = ('A', [0, 1])
>>> affected = [('B', [2,3])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  3  5  6
1  1  2  5  7
1  1  3  5  7
>>> causing = ('A', [0])
>>> affected = [('B', [2,3]), ('C', [5, 6])]
>>> dfa = df_apply_dependency_augmentation(df, causing, affected)
>>> dfa
   A  B  C  D
0  0  2  5  6
0  0  2  6  6
0  0  3  5  6
0  0  3  6  6
1  1  3  5  7
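
Because augmented rows share the index label of their source row, the returned dataframe generally has a non-unique index. The following sketch shows a few standard pandas ways to work with that. The frame `dfa` is built by hand here as a stand-in for the first augmented result above, so the snippet does not depend on the function itself; the variable names are illustrative only.

```python
import pandas as pd

# Hand-built stand-in for an augmented result with repeated index labels.
dfa = pd.DataFrame(
    {"A": [0, 0, 0, 1], "B": [2, 3, 4, 3], "C": [5, 5, 5, 5], "D": [6, 6, 6, 7]},
    index=[0, 0, 0, 1],
)

# Select every row that originates from original row 0.
from_row_0 = dfa.loc[[0]]

# Count how many rows each original row contributed.
rows_per_original = dfa.groupby(level=0).size()

# Switch to a unique positional index if label lookups must be unambiguous.
dfa_unique = dfa.reset_index(drop=True)
```

Resetting the index discards the link to the original rows, so it may be worth storing the old labels in a column first if that link is still needed.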