Stepwise Regression Analysis

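Keywords: stepwise, variable selection, linear regression

Overview

This sample is a C program that performs stepwise regression analysis. Using the 13 observations shown below, each with 4 explanatory variables, it runs a regression with stepwise variable selection and prints the estimated intercept of the fitted regression model, the estimated partial regression coefficients and standard errors of the selected explanatory variables, and the residual mean square.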

Data used to compute the stepwise regression statistics

Note: this sample is the Example code for the nAG C Library function nag_full_step_regsn(). For details of the sample and the function, see the nag_full_step_regsn manual page.

Input Data

(For details of this function, see the nag_full_step_regsn manual page.)

nag_full_step_regsn (g02efc) Example Program Data
13 4 4.0 2.0 1.0e-6 1        : N,M,FIN,FOUT,TAU,MONLEV
 7.0 26.0  6.0 60.0  78.5
 1.0 29.0 15.0 52.0  74.3
11.0 56.0  8.0 20.0 104.3
11.0 31.0  8.0 47.0  87.6
 7.0 52.0  6.0 33.0  95.9
11.0 55.0  9.0 22.0 109.2
 3.0 71.0 17.0  6.0 102.7
 1.0 31.0 22.0 44.0  72.5
 2.0 54.0 18.0 22.0  93.1
21.0 47.0  4.0 26.0 115.9
 1.0 40.0 23.0 34.0  83.8
11.0 66.0  9.0 12.0 113.3
10.0 68.0  8.0 12.0 109.4    : End of X array of size N by M+1
1 1 1 1                      : Array ISX

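Line 1 is a title line and is skipped.
Line 2 specifies the number of observations (n), the number of explanatory variables (m), the variance ratio that a variable must exceed to enter the model (fin), the variance ratio below which a variable is removed from the model (fout), and the tolerance (tau). The sixth value on this line (MONLEV) is not read by the sample program.
Lines 3 to 15 give the explanatory variables in the first four columns and the observed response in the rightmost column; all five columns are read into the array x.
Line 16 specifies, for each explanatory variable, whether it may be used in the stepwise variable selection (isx).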

Output

nag_full_step_regsn (g02efc) Example Program Results

 Starting Stepwise Selection
 
 Forward Selection
 Variable    1 Variance ratio =    1.260E+01
 Variable    2 Variance ratio =    2.196E+01
 Variable    3 Variance ratio =    4.403E+00
 Variable    4 Variance ratio =    2.280E+01
 
 Adding variable    4 to model
 
 Backward Selection
 Variable    4 Variance ratio =    2.280E+01
 
 Keeping all current variables
 
 Forward Selection
 Variable    1 Variance ratio =    1.082E+02
 Variable    2 Variance ratio =    1.725E-01
 Variable    3 Variance ratio =    4.029E+01
 
 Adding variable    1 to model
 
 Backward Selection
 Variable    1 Variance ratio =    1.082E+02
 Variable    4 Variance ratio =    1.593E+02
 
 Keeping all current variables
 
 Forward Selection
 Variable    2 Variance ratio =    5.026E+00
 Variable    3 Variance ratio =    4.236E+00
 
 Adding variable    2 to model
 
 Backward Selection
 Variable    1 Variance ratio =    1.540E+02
 Variable    2 Variance ratio =    5.026E+00
 Variable    4 Variance ratio =    1.863E+00
 
 Dropping variable    4 from model
 
 Forward Selection
 Variable    3 Variance ratio =    1.832E+00
 Variable    4 Variance ratio =    1.863E+00
 
 Finished Stepwise Selection

Fitted Model Summary
Term          Estimate      Standard Error
Intercept:     5.258e+01        2.294e+00
Variable:   1  1.468e+00        1.213e-01
Variable:   2  6.623e-01        4.585e-02

RMS: 5.790e+00

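The Intercept row of the Fitted Model Summary gives the estimate and standard error of the intercept of the fitted regression model.
The two Variable rows give the least-squares estimates and standard errors of the partial regression coefficients for explanatory variables 1 and 2, the variables chosen by the stepwise selection.
The final RMS line gives the residual mean square of the fitted regression model.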

Source Code

Note: this sample source code calls functions from the nAG Numerical Library (available for Windows, Linux, Mac and other platforms).

/* nag_full_step_regsn (g02efc) Example Program.
 *
 * CLL6I261D/CLL6I261DL Version.
 *
 * Copyright 2017 Numerical Algorithms Group.
 *
 * Mark 26.1, 2017.
 */

#include <stdio.h>
#include <nag.h>
#include <nag_stdlib.h>
#include <nagg02.h>

int main(void)
{
  /* Scalars */
  double fin, fout, rms, rsq, sw, tau;
  Integer df, exit_status, i, j, m, n, pdx;

  /* Arrays */
  double *b = 0, *c = 0, *se = 0, *wmean = 0, *x = 0;
  Integer *isx = 0;

  /* Nag types */
  Nag_OrderType order;
  Nag_SumSquare mean;
  Nag_Comm comm;
  NagError fail;

#ifdef NAG_COLUMN_ORDER
#define X(I, J) x[((J) - 1) * pdx + (I) - 1]
  order = Nag_ColMajor;
#else
#define X(I, J) x[((I) - 1) * pdx + (J) - 1]
  order = Nag_RowMajor;
#endif

  INIT_FAIL(fail);

  exit_status = 0;

  printf("nag_full_step_regsn (g02efc) Example Program Results\n\n");

  /* Skip heading in data file */
  scanf("%*[^\n]");
  /* Read n, m, fin, fout and tau. NAG_IFMT is the library's format
   * specifier for Integer; "%ld" is only correct when Integer is long.
   * The trailing MONLEV value is skipped with the rest of the line. */
  scanf("%" NAG_IFMT " %" NAG_IFMT " %lf %lf %lf", &n, &m, &fin, &fout, &tau);
  scanf("%*[^\n]");

  if (n > 1 && m > 1) {
    /* Allocate memory */
    if (!(b = NAG_ALLOC(m + 1, double)) ||
        !(c = NAG_ALLOC((m + 1) * (m + 2) / 2, double)) ||
        !(se = NAG_ALLOC(m + 1, double)) ||
        !(wmean = NAG_ALLOC(m + 1, double)) ||
        !(x = NAG_ALLOC(n * (m + 1), double)) ||
        !(isx = NAG_ALLOC(m, Integer)))
    {
      printf("Allocation failure\n");
      exit_status = -1;
      goto END;
    }
  }
  else {
    printf("Invalid n or m.\n");
    exit_status = 1;
    return exit_status;
  }

  /* Leading dimension of x for the chosen storage order */
#ifdef NAG_COLUMN_ORDER
  pdx = n;
#else
  pdx = m + 1;
#endif

  /* Read the n by (m+1) data matrix: m explanatory variables, then the
   * response in the last column */
  for (i = 1; i <= n; ++i) {
    for (j = 1; j <= m + 1; ++j) {
      scanf("%lf", &X(i, j));
    }
  }
  scanf("%*[^\n]");

  /* Read the isx flags, one per explanatory variable */
  for (j = 1; j <= m; ++j) {
    scanf("%" NAG_IFMT, &isx[j - 1]);
  }
  scanf("%*[^\n]");

  /* nag_sum_sqs (g02buc).
   * Computes sums of squares and cross-products of deviations
   * from the mean for the augmented matrix
   */
  mean = Nag_AboutMean;
  nag_sum_sqs(order, mean, n, m + 1, x, pdx, 0, &sw, wmean, c, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_sum_sqs (g02buc).\n%s\n.", fail.message);
    exit_status = 1;
    goto END;
  }

  fflush(stdout);

  /* Perform stepwise selection of variables using
   * nag_full_step_regsn (g02efc):
   *   Stepwise linear regression.
   */
  nag_full_step_regsn(m, n, wmean, c, sw, isx, fin, fout, tau, b, se, &rsq,
                      &rms, &df, nag_full_step_regsn_monfun, &comm, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_full_step_regsn (g02efc).\n%s\n.", fail.message);
    exit_status = 1;
    goto END;
  }

  /* Display summary information for fitted model */
  printf("\n");
  printf("Fitted Model Summary\n");
  printf("%-10s   %-10s%19s\n", "Term", " Estimate", "Standard Error");
  printf("%-10s   %11.3e%17.3e\n", "Intercept:", b[0], se[0]);
  /* Print only the variables whose isx flag is 1 or 2 after the call */
  for (i = 1; i <= m; ++i) {
    j = isx[i - 1];
    if (j == 1 || j == 2) {
      printf("%-10s%3" NAG_IFMT "%11.3e%17.3e\n", "Variable:", i, b[i],
             se[i]);
    }
  }
  printf("\n");
  printf("RMS: %-12.3e\n\n", rms);

END:
  NAG_FREE(b);
  NAG_FREE(c);
  NAG_FREE(se);
  NAG_FREE(wmean);
  NAG_FREE(x);
  NAG_FREE(isx);

  return exit_status;
}

Sample source code in C. Function used: nag_full_step_regsn (g02efc)
