Finding coefficients for logistic regression

I'm working on a classification problem and need the coefficients of the logistic regression equation. I can find the coefficients in R, but I need to submit the project in Python. How do I get the coefficient values in scikit-learn?

9 Answers

Use sklearn.linear_model.LogisticRegression. After fitting, the coefficients are in coef_ and the intercept is in intercept_. See this example:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(random_state=0).fit(X, y)
print(clf.coef_, clf.intercept_)
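
To make it concrete what those two attributes hold, here is a small sketch of my own (not part of the answer above). It restricts iris to two classes so that coef_ has a single row, then checks that the model's log-odds are simply X @ coef_ + intercept_. The names mask, Xb, yb, log_odds and prob_class1 are made up for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Keep only classes 0 and 1 so the problem is binary (illustrative choice).
mask = y < 2
Xb, yb = X[mask], y[mask]

clf = LogisticRegression(random_state=0).fit(Xb, yb)

# For a binary model, coef_ has shape (1, n_features) and intercept_ has shape (1,).
print(clf.coef_.shape, clf.intercept_.shape)

# Rebuild the linear predictor (log-odds of class clf.classes_[1]) by hand.
log_odds = Xb @ clf.coef_[0] + clf.intercept_[0]

# Passing the log-odds through the logistic function gives P(class 1).
prob_class1 = 1.0 / (1.0 + np.exp(-log_odds))

# Both comparisons should agree with what scikit-learn computes internally.
print(np.allclose(log_odds, clf.decision_function(Xb)))
print(np.allclose(prob_class1, clf.predict_proba(Xb)[:, 1]))
```

In the binary case the single row of coef_ refers to the second entry of clf.classes_, which is worth checking when your labels are strings.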

The statsmodels library would give you a breakdown of the coefficient results, as well as the associated p-values to determine their significance.

Using an example of x1 and y1 variables:

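import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# x1 holds the features and y1 the binary target
x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, random_state=0)
logreg = LogisticRegression().fit(x1_train, y1_train)
print("Training set score: {:.3f}".format(logreg.score(x1_train, y1_train)))
print("Test set score: {:.3f}".format(logreg.score(x1_test, y1_test)))

# sm.Logit does not add an intercept by itself; add_constant supplies the "const" column
logit_model = sm.Logit(y1, sm.add_constant(x1))
result = logit_model.fit()
print(result.summary())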

Example results:

Optimization terminated successfully.
         Current function value: 0.596755
         Iterations 7
                           Logit Regression Results
==============================================================================
Dep. Variable:             IsCanceled   No. Observations:                20000
Model:                          Logit   Df Residuals:                    19996
Method:                           MLE   Df Model:                            3
Date:                Sat, 17 Aug 2019   Pseudo R-squ.:                  0.1391
Time:                        23:58:55   Log-Likelihood:                -11935.
converged:                       True   LL-Null:                       -13863.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -2.1417      0.050    -43.216      0.000      -2.239      -2.045
x1             0.0055      0.000     32.013      0.000       0.005       0.006
x2             0.0236      0.001     36.465      0.000       0.022       0.025
x3             2.1137      0.104     20.400      0.000       1.911       2.317
==============================================================================

Provided that X_train is a pandas DataFrame and clf is your fitted logistic regression model, you can get each feature's name together with its coefficient with this line of code (pandas imported as pd, NumPy as np):

pd.DataFrame(zip(X_train.columns, np.transpose(clf.coef_)), columns=['features', 'coef']) 

Have a look at the statsmodels library's Logit model.

You would use it like this:

from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant
x = [...] # Observations
y = [...] # Response variable
x = add_constant(x)
print(Logit(y, x).fit().summary())
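
One thing to keep in mind when comparing against R or statsmodels: scikit-learn's LogisticRegression applies L2 regularization by default (C=1.0), so its coefficients can differ from a plain maximum-likelihood fit such as R's glm or statsmodels' Logit. The sketch below, on made-up synthetic data, uses a large C as one way to make the penalty negligible. The particular values of C and max_iter are my assumptions, not something taken from the answers here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant

# Synthetic data for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 3))
log_odds = 0.3 + X @ np.array([1.5, -1.0, 0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

# Unpenalized maximum-likelihood fit (intercept is the first parameter).
sm_params = Logit(y, add_constant(X)).fit(disp=0).params

# Default scikit-learn fit: L2-penalized with C=1.0.
default_clf = LogisticRegression().fit(X, y)
default_params = np.concatenate([default_clf.intercept_, default_clf.coef_[0]])

# A large C makes the penalty negligible (assumed settings).
weak_clf = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)
weak_params = np.concatenate([weak_clf.intercept_, weak_clf.coef_[0]])

print("statsmodels    :", sm_params)
print("sklearn, C=1.0 :", default_params)
print("sklearn, C=1e6 :", weak_params)
```

On a dataset this large the default penalty changes little, but with few observations or many features the gap can be noticeable, so it is worth checking before reporting coefficients next to R output.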

Luffy, please remember to always share your code and your attempts so we can know what you tried and help you out. Regardless of that, I think you are looking for this:

import numpy as np
from sklearn.linear_model import LogisticRegression
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])  # Your x values, for a two-variable model
y = np.array([0, 0, 1, 1])  # Class labels; logistic regression needs classes, not continuous y-values
reg = LogisticRegression().fit(X, y)  # Fit the model to your X and y values
print(reg.coef_)  # Array of coefficients (b1 and b2, or as many bs as your model has)
print(reg.intercept_)  # Value of the intercept (b0)
print(reg.predict(np.array([[3, 5]])))  # Predicted class labels for the given inputs

With a few more details and showing how to replace the final layer of a pytorch model:

#%%
"""
Get the weights & biases to set them to a nn.Linear layer in pytorch
"""
import numpy as np
import torch
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from torch import nn
X, y = load_iris(return_X_y=True)
print(f'{X.shape=}')
print(f'{y.shape=}')
Din: int = X.shape[1]
total_data_set_size: int = X.shape[0]
assert y.shape[0] == total_data_set_size
clf = LogisticRegression(random_state=0).fit(X, y)
out = clf.predict(X[:2, :])
# print(f'{out=}')
out = clf.predict_proba(X[:2, :])
print(f'{out=}')
clf.score(X, y)
# coef_: ndarray of shape (1, n_features) or (n_classes, n_features)
print(f'{clf.coef_.shape=}')
print(f'{clf.intercept_.shape=}')
assert (clf.coef_.shape[1] == Din)
Dout: int = clf.coef_.shape[0]
print(f'{Dout=} which is the number of classes too in classification')
assert (Dout == clf.intercept_.shape[0])
print()
num_classes: int = Dout
mdl = nn.Linear(in_features=Din, out_features=num_classes)
mdl.weight = torch.nn.Parameter(torch.from_numpy(clf.coef_))
mdl.bias = torch.nn.Parameter(torch.from_numpy(clf.intercept_))
out2 = torch.softmax(mdl(torch.from_numpy(X[:2, :])), dim=1)
print(f'{out2=}')
assert np.isclose(out2.detach().cpu().numpy(), out).all()
# -
# module: nn.Module = getattr(base_model, layer_to_replace)
# num_classes: int = clf.coef_.shape[0] # out_features=Dout
# num_features: int = clf.coef_.shape[1] # in_features
# assert module.weight.size() == torch.Size([num_classes, num_features])
# assert module.bias.size() == torch.Size([num_classes])
# module.weight = torch.nn.Parameter(torch.from_numpy(clf.coef_))
# module.bias = torch.nn.Parameter(torch.from_numpy(clf.intercept_))

A small correction to the earlier DataFrame answer: for a binary model clf.coef_ is two-dimensional with a single row, so take that row before zipping it with the column names:

pd.DataFrame(zip(X_train.columns, clf.coef_[0]), columns=['features', 'coef'])

from sklearn.linear_model import LogisticRegression
model = LogisticRegression(solver='liblinear', random_state=10)
mdle = model.fit(X_train, Y_train)
print(mdle.classes_)
print("model intercept :::" + format(model.intercept_[0], '.5f'))
print("model coefficient :::" + format(model.coef_[0][0], '.5f'))

If you want to map coefficient names to their values you can use

def logreg_to_dict(clf: LogisticRegression, feature_names: list[str]) -> dict[str, float]:
    coefs = np.concatenate([clf.intercept_, clf.coef_.squeeze()])
    return dict(zip(["intercept"] + feature_names, coefs))

feature_names is a list of features the model was trained on.
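
For a multiclass model, coef_ has one row per class, so a flat dict no longer fits. One possible layout is a table with one row per class and one column per feature. This is a sketch of my own using iris loaded as a DataFrame; the name coef_table and the max_iter value are made up for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# as_frame=True returns X as a DataFrame, so the feature names are available.
X, y = load_iris(return_X_y=True, as_frame=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# One row per class (clf.classes_), one column per feature.
coef_table = pd.DataFrame(clf.coef_, index=clf.classes_, columns=X.columns)

# Put the per-class intercepts in the first column.
coef_table.insert(0, "intercept", clf.intercept_)
print(coef_table)
```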
