score_nuisances with discrete treatment returns incorrect score

@Cantal00p

When using score_nuisances with a discrete treatment, the function does not return the correct score.

The issue comes from the inverse_onehot function in econml/utilities.py. When it receives a DataFrame generated by pandas.get_dummies(), it decodes the treatment incorrectly.

For example, in the case of binary treatments, labels originally coded as 0 and 1 are decoded as 1 and 2 by the current implementation:

    def inverse_onehot(T):
        """
        Given a one-hot encoding of a value, return a vector reversing the encoding to get numeric treatment indices.

        Note that we assume that the first column has been removed from the input.
        """
        assert ndim(T) == 2
        # note that by default OneHotEncoder returns float64s, so need to convert to int
        return (T @ np.arange(1, T.shape[1] + 1)).astype(int)

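The weights np.arange(1, T.shape[1] + 1) start at 1, which is correct only when the first column has been removed from the input. A DataFrame from pandas.get_dummies() keeps every category column by default, so this logic introduces an off-by-one error: every decoded label is shifted up by one.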
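The shift can be reproduced without econml. The sketch below only mirrors the arithmetic of inverse_onehot using NumPy and pandas; the treatment vector is made up for illustration, and the variable names are not part of the library.

```python
import numpy as np
import pandas as pd

# Hypothetical binary treatment vector, for illustration only.
treatment = pd.Series([0, 1, 1, 0], name="T")

# Full one-hot encoding: get_dummies keeps every category by default.
onehot_full = pd.get_dummies(treatment).to_numpy(dtype=float)

# Same arithmetic as the current inverse_onehot: weights start at 1.
weights_full = np.arange(1, onehot_full.shape[1] + 1)
decoded = (onehot_full @ weights_full).astype(int)
print(decoded.tolist())  # labels come back shifted up by one

# With the first column dropped, the same arithmetic recovers the labels.
onehot_dropped = pd.get_dummies(treatment, drop_first=True).to_numpy(dtype=float)
weights_dropped = np.arange(1, onehot_dropped.shape[1] + 1)
decoded_dropped = (onehot_dropped @ weights_dropped).astype(int)
print(decoded_dropped.tolist())  # original 0/1 labels
```

The only difference between the two cases is whether the first column is present, which is the assumption stated in the function's docstring.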

Expected behavior

The function should return zero-based indices, ensuring that discrete treatments (e.g. 0/1) remain consistent after decoding. A corrected implementation could look like this:

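    import numpy as np
    import pandas as pd
    import econml.utilities

    def inverse_onehot(T):
        assert econml.utilities.ndim(T) == 2
        # A DataFrame (e.g. from pandas.get_dummies()) is treated as a full encoding: weights start at 0.
        # Any other input is assumed to have had its first column removed: weights start at 1.
        indices = (
            np.arange(0, T.shape[1])
            if isinstance(T, pd.DataFrame)
            else np.arange(1, T.shape[1] + 1)
        )
        return (T @ indices).astype(int)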

This change guarantees that score_nuisances computes the correct score for discrete treatments.
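A quick round-trip check of the proposal might look like the sketch below. It uses a local copy named inverse_onehot_proposed (a hypothetical name) and substitutes np.ndim for econml.utilities.ndim so that it runs without econml installed; the three-level treatment vector is made up for illustration.

```python
import numpy as np
import pandas as pd

def inverse_onehot_proposed(T):
    # Local copy of the proposal above; np.ndim stands in for econml.utilities.ndim.
    assert np.ndim(T) == 2
    indices = (
        np.arange(0, T.shape[1])
        if isinstance(T, pd.DataFrame)
        else np.arange(1, T.shape[1] + 1)
    )
    return (T @ indices).astype(int)

# Hypothetical three-level treatment, for illustration only.
treatment = pd.Series([0, 1, 2, 1, 0])

# DataFrame input with every category column present.
full = pd.get_dummies(treatment).astype(float)
decoded_df = np.asarray(inverse_onehot_proposed(full))

# ndarray input with the first column already removed.
dropped = pd.get_dummies(treatment, drop_first=True).to_numpy(dtype=float)
decoded_arr = inverse_onehot_proposed(dropped)

print(decoded_df.tolist(), decoded_arr.tolist())
```

Both inputs decode to the original labels here. Note that the branch keys on the container type rather than on whether the first column was dropped, so a DataFrame built with drop_first=True is not covered by this check.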

Contributed by @Cantal00p, @f5ilverio

score_nuisances with discrete treatment returns incorrect score #1006
