{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c5233c68",
   "metadata": {},
   "source": [
    "# Proposed Debiasing Method for Inferring High-Dimensional Linear Models (Example Code via ``Debias-Infer``)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "182569fb",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:52:09.618389Z",
     "start_time": "2023-09-05T04:52:07.973125Z"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from sklearn.preprocessing import PolynomialFeatures\n",
    "import scipy.stats\n",
    "from sklearn.linear_model import LogisticRegressionCV\n",
    "from sklearn.neural_network import MLPClassifier\n",
    "from sklearn.calibration import CalibratedClassifierCV\n",
    "\n",
    "from debiasing.debias_prog import ScaledLasso, DebiasProg, DualObj, DualCD, DebiasProgCV"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "76583f3b",
   "metadata": {},
   "source": [
    "## Simulate Random Sample\n",
    "\n",
    "We generate the i.i.d. data $\\{(Y_i,R_i,X_i)\\}_{i=1}^n \\subset \\mathbb{R}\\times \\{0,1\\} \\times\\mathbb{R}^d$ from the following linear model \n",
    "$$Y_i=X_i^T \\beta_0 + \\epsilon_i \\quad \\text{ with } \\quad X_i \\perp\\!\\!\\!\\perp \\epsilon_i, \\quad Y_i \\perp\\!\\!\\!\\perp R_i|X_i, \\quad X_i \\sim \\mathcal{N}_d(\\mathbf{0},\\Sigma), \\quad \\text{ and } \\quad \\epsilon_i \\sim \\mathcal{N}(0,1),$$\n",
    "where $d=1000$ and $n=900$.\n",
    "\n",
    "We adopt the circulant symmetric matrix $\\Sigma^{\\mathrm{cs}}$ in Javanmard and Montanari (2014) that is defined as $\\Sigma_{jj}=1$, $\\Sigma_{jk}=0.1 \\text{ when } j+1\\leq k\\leq j+5 \\text{ or } j+d-5\\leq k \\leq j+d-1$ with $\\Sigma_{jk}=0$ elsewhere for $j\\leq k$, and $\\Sigma_{jk}=\\Sigma_{kj}$.\n",
    "\n",
    "The true regression coefficient is $\\beta_0^{sp}=\\left(\\underbrace{\\sqrt{5},...,\\sqrt{5}}_5,0,...,0\\right)^T \\in \\mathbb{R}^d$, and the query point is $x=\\left(1,\\frac{1}{2},\\frac{1}{4},0,0,0,\\frac{1}{2},\\frac{1}{8},0,...,0\\right)^T\\in\\mathbb{R}^d$ to infer the joint effects of a few components of $\\beta_0$.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "247a292b",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:52:10.411191Z",
     "start_time": "2023-09-05T04:52:09.621748Z"
    }
   },
   "outputs": [],
   "source": [
    "d = 1000\n",
    "n = 900\n",
    "\n",
    "# Circulant symmetric covariance\n",
    "Sigma = np.zeros((d,d)) + np.eye(d)\n",
    "rho = 0.1\n",
    "for i in range(d):\n",
    "    for j in range(i+1, d):\n",
    "        if (j < i+6) or (j > i+d-6):\n",
    "            Sigma[i,j] = rho\n",
    "            Sigma[j,i] = rho\n",
    "sig = 1\n",
    "\n",
    "# True regression coefficient\n",
    "s_beta = 5\n",
    "beta_0 = np.zeros((d,))\n",
    "beta_0[:s_beta] = np.sqrt(5)\n",
    "\n",
    "X_sim = np.random.multivariate_normal(mean=np.zeros(d), cov=Sigma, size=n)\n",
    "eps_err_sim = sig*np.random.randn(n)\n",
    "Y_sim = np.dot(X_sim, beta_0) + eps_err_sim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3f60924f",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:52:10.418099Z",
     "start_time": "2023-09-05T04:52:10.414153Z"
    }
   },
   "outputs": [],
   "source": [
    "# Query point\n",
    "x = np.zeros((d,))\n",
    "x[0] = 1\n",
    "x[1] = 1/2\n",
    "x[2] = 1/4\n",
    "x[6] = 1/2\n",
    "x[7] = 1/8\n",
    "\n",
    "# True regression function\n",
    "m_true = np.dot(x, beta_0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e32bb7ec",
   "metadata": {},
   "source": [
    "To increase the complexity of estimating the propensity score, we generate the missing indicators $R_i,i=1,...,n$ for the outcome variables $Y_i,i=1,...,n$ as:\n",
    "$$\\mathrm{P}(R_i=1|X_i) = \\Phi\\left(-4+\\sum_{k=1}^K Z_{ik} \\right),$$\n",
    "where $\\Phi(\\cdot)$ is the CDF of $\\mathcal{N}(0,1)$ and the vector $(Z_{i1},...,Z_{iK})$ contains all polynomial combinations of the first eight components $X_{i1},...,X_{i8}$ of the covariate vector $X_i$ with degrees less than or equal to two (_i.e._, including the linear, quadratic, and one-way interaction terms)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4234ec77",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:52:10.439262Z",
     "start_time": "2023-09-05T04:52:10.421696Z"
    }
   },
   "outputs": [],
   "source": [
    "# MAR\n",
    "inter_mat = PolynomialFeatures(degree=2, interaction_only=False, \n",
    "                               include_bias=False).fit_transform(X_sim[:,:8])\n",
    "obs_prob = scipy.stats.norm.cdf(-4 + np.sum(inter_mat, axis=1))\n",
    "R = np.ones((n,))\n",
    "R[np.random.rand(n) >= obs_prob] = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34aab858",
   "metadata": {},
   "source": [
    "## Detailed Procedures of Applying Our Proposed Debiasing Method\n",
    "### 1. Lasso pilot estimate:\n",
    "\n",
    "To select the regularization parameter $\\lambda>0$ in a data adaptive way, we adopt the scaled Lasso (Sun and Zhang, 2012) with its universal regularization parameter $\\lambda_0=\\sqrt{\\frac{2\\log d}{n}}$ as the initialization. Specifically, it provides an iterative algorithm to obtain the Lasso estimate $\\hat{\\beta}$ and a consistent estimator $\\hat{\\sigma}_{\\epsilon}$ of the noise level $\\sigma_{\\epsilon}$ from the following jointly convex optimization problem:\n",
    "$$\\left(\\hat{\\beta}(\\tilde{\\lambda}), \\hat{\\sigma}_{\\epsilon}(\\tilde{\\lambda})\\right) = \\arg\\min_{\\beta\\in \\mathbb{R}^d, \\sigma_{\\epsilon} >0} \\left[\\frac{1}{2n\\sigma_{\\epsilon}} \\sum_{i=1}^n R_i\\left(Y_i - X_i^T\\beta\\right)^2 + \\frac{\\sigma_{\\epsilon}}{2}+\\tilde{\\lambda}\\|\\beta\\|_1\\right],$$\n",
    "where the regularization parameter $\\tilde{\\lambda} >0$ is updated iteratively as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d59fc2e9",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:52:10.625477Z",
     "start_time": "2023-09-05T04:52:10.448482Z"
    }
   },
   "outputs": [],
   "source": [
    "# Lasso pilot estimator\n",
    "beta_pilot, sigma_hat = ScaledLasso(X=X_sim[R == 1,:], Y=Y_sim[R == 1], lam0='univ')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50eecfff",
   "metadata": {},
   "source": [
    "### 2. Propensity Score Estimation:\n",
    "\n",
    "We consider both the oracle/true propensity scores and the estimated propensity scores by the Lasso-type logistic regression, two-layer neural network (with and without the Platt's logistic calibration (Platt, 1999))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "502d2903",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T04:53:11.445322Z",
     "start_time": "2023-09-05T04:52:10.635035Z"
    }
   },
   "outputs": [],
   "source": [
    "## Propensity score estimation (Oracle, LR, MLP neural network)\n",
    "for non_met in ['Oracle', 'LR', 'NN', 'NNcal']:\n",
    "    if non_met == 'Oracle':\n",
    "        prop_score1 = obs_prob.copy()\n",
    "        MAE_prop = np.mean(abs(prop_score1 - obs_prob))\n",
    "        \n",
    "    if non_met == 'LR':\n",
    "        zeta2 = np.logspace(-1, np.log10(300), 40)*np.sqrt(np.log(d)/n)\n",
    "        lr2 = LogisticRegressionCV(Cs=1/zeta2, cv=5, penalty='l1', scoring='neg_log_loss', \n",
    "                                   solver='liblinear', tol=1e-6, max_iter=10000).fit(X_sim, R)\n",
    "        prop_score2 = lr2.predict_proba(X_sim)[:,1]\n",
    "        MAE_prop = np.mean(abs(prop_score2 - obs_prob))\n",
    "        \n",
    "    if non_met == 'NN':\n",
    "        lr2_NN = MLPClassifier(hidden_layer_sizes=(80,50,), activation='relu', \n",
    "                               random_state=None, learning_rate='adaptive', \n",
    "                               learning_rate_init=0.001, max_iter=1000).fit(X_sim, R)\n",
    "        prop_score3 = lr2_NN.predict_proba(X_sim)[:,1]\n",
    "        MAE_prop = np.mean(abs(prop_score3 - obs_prob))\n",
    "        \n",
    "    if non_met == 'NNcal':\n",
    "        NN_base = MLPClassifier(hidden_layer_sizes=(80,50,), activation='relu', random_state=None, \n",
    "                                learning_rate='adaptive', learning_rate_init=0.001, max_iter=1000)\n",
    "        lr2_NN = CalibratedClassifierCV(NN_base, method='sigmoid', cv=5).fit(X_sim, R)\n",
    "        prop_score4 = lr2_NN.predict_proba(X_sim)[:,1]\n",
    "        MAE_prop = np.mean(abs(prop_score4 - obs_prob))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d7de6f1",
   "metadata": {},
   "source": [
    "### 3. Solving Our Debiasing Program\n",
    "\n",
    "We solve for the weights through our debiasing program, where the tuning parameter $\\frac{\\gamma}{n}>0$ is selected via 5-fold cross-validations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "1c393741",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:01:47.285487Z",
     "start_time": "2023-09-05T04:53:11.449227Z"
    },
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n"
     ]
    }
   ],
   "source": [
    "## Oracle\n",
    "w_1se1, ll_1se1, gamma_n_1se1 = DebiasProgCV(X=X_sim.copy(), x=x, prop_score=prop_score1, gamma_lst=None, \n",
    "                                             cv_fold=5, cv_rule='1se')\n",
    "# w_minfeas1, ll_minfeas1, gamma_n_minfeas1 = DebiasProgCV(X=X_sim.copy(), x=x, prop_score=prop_score1, \n",
    "#                                                          gamma_lst=None, cv_fold=5, cv_rule='minfeas')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "907bf3f4",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:09:39.372875Z",
     "start_time": "2023-09-05T05:01:47.287520Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n"
     ]
    }
   ],
   "source": [
    "## LR\n",
    "w_1se2, ll_1se2, gamma_n_1se2 = DebiasProgCV(X=X_sim.copy(), x=x, prop_score=prop_score2, gamma_lst=None, \n",
    "                                             cv_fold=5, cv_rule='1se')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "c429d227",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:21:17.341091Z",
     "start_time": "2023-09-05T05:09:39.375884Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The coordinate descent algorithm has reached its maximum number of iterations: 5000! Reiterate one more times without small perturbations to the scaled design matrix...\n",
      "The strong duality between primal and dual programs does not satisfy when \\gamma/n=0.051!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n"
     ]
    }
   ],
   "source": [
    "## NN\n",
    "w_1se3, ll_1se3, gamma_n_1se3 = DebiasProgCV(X=X_sim.copy(), x=x, prop_score=prop_score3, gamma_lst=None, \n",
    "                                             cv_fold=5, cv_rule='1se')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "7fef6558",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:29:11.466656Z",
     "start_time": "2023-09-05T05:21:17.344207Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.001!\n",
      "The primal debiasing program is infeasible!\n",
      "The primal debiasing program for this fold of the data is not feasible when \\gamma/n=0.026!\n"
     ]
    }
   ],
   "source": [
    "## NNcal\n",
    "w_1se4, ll_1se4, gamma_n_1se4 = DebiasProgCV(X=X_sim.copy(), x=x, prop_score=prop_score4, gamma_lst=None, \n",
    "                                             cv_fold=5, cv_rule='1se')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3b2da26",
   "metadata": {},
   "source": [
    "If the tuning parameter $\\frac{\\gamma}{n}>0$ is pre-selected, then we can run the primal/dual debiasing program without cross-validations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "08db4ad7",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:29:15.135929Z",
     "start_time": "2023-09-05T05:29:11.468898Z"
    }
   },
   "outputs": [],
   "source": [
    "w_1se1_new = DebiasProg(X=X_sim.copy(), x=x, Pi=np.diag(prop_score1), gamma_n=gamma_n_1se1)\n",
    "ll_1se1_new = DualCD(X=X_sim.copy(), x=x, Pi=np.diag(prop_score1), gamma_n=gamma_n_1se1, ll_init=None, \n",
    "                 eps=1e-9, max_iter=5000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "123735a1",
   "metadata": {},
   "source": [
    "### 4. Construct the Debiased Estimator and its Asymptotic 95% Confidence Interval"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "80f3aa97",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-09-05T05:41:54.721344Z",
     "start_time": "2023-09-05T05:41:54.680104Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The true regression function is 3.913118960624632.\n",
      "\n",
      "The 95% confidence interval of our debiasing method under the oracle propensity scores is [3.8227718343812698, 3.9852952726087865].\n",
      "\n",
      "The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by the Lasso-type logistic regression is \n",
      "[3.8531350930448305, 4.029273820154903].\n",
      "\n",
      "The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by the two-layer neural network is \n",
      "[3.824591171878295, 3.9872310918920717].\n",
      "\n",
      "The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by the two-layer neural network with Platt's logistic calibration is \n",
      "[3.8490969609870653, 4.02376413689852].\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print('The true regression function is '+str(m_true)+'.\\n')\n",
    "\n",
    "## Oracle\n",
    "m_deb1 = np.dot(x, beta_pilot) + np.sum(w_1se1 * R * (Y_sim - np.dot(X_sim, beta_pilot)))/np.sqrt(n)\n",
    "asym_sd1 = np.sqrt(np.sum(prop_score1 * w_1se1**2)/n)\n",
    "print('The 95% confidence interval of our debiasing method under the oracle propensity scores is '\\\n",
    "      +str([m_deb1 - asym_sd1*sigma_hat*scipy.stats.norm.ppf(1-0.05/2), \n",
    "            m_deb1 + asym_sd1*sigma_hat*scipy.stats.norm.ppf(1-0.05/2)])+'.\\n')\n",
    "\n",
    "## LR\n",
    "m_deb2 = np.dot(x, beta_pilot) + np.sum(w_1se2 * R * (Y_sim - np.dot(X_sim, beta_pilot)))/np.sqrt(n)\n",
    "asym_sd2 = np.sqrt(np.sum(prop_score2 * w_1se2**2)/n)\n",
    "print('The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by '\\\n",
    "      'the Lasso-type logistic regression is \\n'\\\n",
    "      +str([m_deb2 - asym_sd2*sigma_hat*scipy.stats.norm.ppf(1-0.05/2), \n",
    "            m_deb2 + asym_sd2*sigma_hat*scipy.stats.norm.ppf(1-0.05/2)])+'.\\n')\n",
    "\n",
    "## NN\n",
    "m_deb3 = np.dot(x, beta_pilot) + np.sum(w_1se3 * R * (Y_sim - np.dot(X_sim, beta_pilot)))/np.sqrt(n)\n",
    "asym_sd3 = np.sqrt(np.sum(prop_score3 * w_1se3**2)/n)\n",
    "print('The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by '\\\n",
    "      'the two-layer neural network is \\n'\\\n",
    "      +str([m_deb3 - asym_sd3*sigma_hat*scipy.stats.norm.ppf(1-0.05/2), \n",
    "            m_deb3 + asym_sd3*sigma_hat*scipy.stats.norm.ppf(1-0.05/2)])+'.\\n')\n",
    "\n",
    "## NNcal\n",
    "m_deb4 = np.dot(x, beta_pilot) + np.sum(w_1se4 * R * (Y_sim - np.dot(X_sim, beta_pilot)))/np.sqrt(n)\n",
    "asym_sd4 = np.sqrt(np.sum(prop_score4 * w_1se4**2)/n)\n",
    "print(\"The 95% confidence interval of our debiasing method when the oracle propensity scores are estimated by \"\\\n",
    "      \"the two-layer neural network with Platt's logistic calibration is \\n\"\\\n",
    "      +str([m_deb4 - asym_sd4*sigma_hat*scipy.stats.norm.ppf(1-0.05/2), \n",
    "            m_deb4 + asym_sd4*sigma_hat*scipy.stats.norm.ppf(1-0.05/2)])+'.\\n')"
   ]
  }
 ],
 "metadata": {
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}