ValueError: endog must be in the unit interval

While using statsmodels, I am getting this weird error: ValueError: endog must be in the unit interval. Can someone give me more information on this error? Google is not helping.

Code that produced the error:

"""
Multiple regression with dummy variables.
"""
import pandas as pd
import statsmodels.api as sm
import pylab as pl
import numpy as np
df = pd.read_csv('cost_data.csv')
df.columns = ['Cost', 'R(t)', 'Day of Week']
dummy_ranks = pd.get_dummies(df['Day of Week'], prefix='days')
cols_to_keep = ['Cost', 'R(t)']
data = df[cols_to_keep].join(dummy_ranks.loc[:, 'days_2':])
data['intercept'] = 1.0
print(data)
train_cols = data.columns[1:]
logit = sm.Logit(data['Cost'], data[train_cols])
result = logit.fit()
print(result.summary())

And the traceback:

Traceback (most recent call last):
  File "multiple_regression_dummy.py", line 20, in <module>
    logit = sm.Logit(data['Cost'], data[train_cols])
  File "/Library/Frameworks/", line 404, in __init__
    raise ValueError("endog must be in the unit interval.")
ValueError: endog must be in the unit interval.

3 Answers

I got this error when my target column had values larger than 1. Logit requires every value of the target (endog) to lie between 0 and 1, so make sure your target column is in that range and try again. For example, if your target column has values 1-5, make 4 and 5 the positive class (1) and 1, 2, and 3 the negative class (0). Hope this helps.
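As a minimal sketch of that recoding, assuming a made-up rating column on a 1-5 scale (the `ratings` series and its values are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical ratings on a 1-5 scale; not the asker's data.
ratings = pd.Series([1, 2, 3, 4, 5, 4, 2], name="rating")

# 4 and 5 become the positive class (1); 1, 2 and 3 become the negative class (0).
target = (ratings >= 4).astype(int)

print(target.tolist())

# Every value now lies in [0, 1], the range sm.Logit requires of endog.
assert target.between(0, 1).all()
```

Where to put the cut-off (here, 4 and above) is a modelling decision that depends on what the positive class is supposed to mean.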


It seems like you followed the same logistic regression tutorial that I did.

I got the same ValueError when I fit my logistic regression, and the trick that got it running was dropping every row of my data that had a missing value (N/A or np.nan). A missing value compares as false against both 0 and 1, so it fails the same range check as an out-of-range number.

This can be done with the pandas function pandas.notnull() as follows:

data = data[pd.notnull(data['Cost'])]
data = data[pd.notnull(data['R(t)'])]
...

and so on, until every variable has the same number of non-missing values to work with.
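The column-by-column filtering can also be written as one `DataFrame.dropna` call. A small sketch, assuming a toy frame that reuses the question's column names with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical frame using the question's column names; the values are made up.
data = pd.DataFrame({
    "Cost": [1.0, 0.0, np.nan, 1.0],
    "R(t)": [0.5, np.nan, 0.2, 0.9],
})

# Keep only the rows where every listed column has a value.
clean = data.dropna(subset=["Cost", "R(t)"])

print(len(data), "->", len(clean))
```

Leaving out `subset` drops any row with a missing value in any column, which is usually what you want just before fitting.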

Hope this helps someone else!

I had the same problem. I fixed it by switching from a classification model to a regression model: I had been using Logit, which is a classification model, on a regression problem.

You can still use statsmodels, but with OLS, for example, instead of Logit. Logit (logistic regression) is for classification problems, and this looks like a regression problem, so using OLS could solve it.
