Partial Least Squares (PLS) Regression Using Singular Value Decomposition

Keywords: singular value decomposition, partial least squares, PLS, regression

Overview

This sample is a C program that computes a partial least squares (PLS) regression using singular value decomposition. It fits the regression to the data shown below.

PLS regression data

Note: This sample is the example code for the function nag_pls_orth_scores_svd() in the nAG C Library. For details of the sample and of the function, see the manual page for nag_pls_orth_scores_svd.

Input Data

nag_pls_orth_scores_svd (g02lac) Example Program Data
15 15 1 Nag_PredStdScale 4 : n, mx, my, iscale, maxfac
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : isx
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  1.9607 -1.6324  0.5746
1.9607 -1.6324  0.5746  2.8369  1.4092 -3.1398 0.00
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  1.9607 -1.6324  0.5746
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.28
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
1.9607 -1.6324  0.5746  2.8369  1.4092 -3.1398 0.20
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.51
-2.6931 -2.5271 -1.2871  2.8369  1.4092 -3.1398  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.11
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701 -4.7548  3.6521  0.8524
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 2.73
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902 -1.2201  0.8829  2.2253 0.18
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  2.4064  1.7438  1.1057
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 1.53
-2.6931 -2.5271 -1.2871  0.0744 -1.7333  0.0902  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 -0.10
 2.2261 -5.3648  0.3049  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 -0.52
-4.1921 -1.0285 -0.9801  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.40
-4.9217  1.2977  0.4473  3.0777  0.3891 -0.0701  0.0744 -1.7333  0.0902
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.30
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701  2.2261 -5.3648  0.3049
2.2261 -5.3648  0.3049  2.8369  1.4092 -3.1398 -1.00
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701 -4.9217  1.2977  0.4473
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 1.57
-2.6931 -2.5271 -1.2871  3.0777  0.3891 -0.0701 -4.1921 -1.0285 -0.9801
0.0744 -1.7333  0.0902  2.8369  1.4092 -3.1398 0.59 : End of observations

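Line 1 is a title line and is skipped.
Line 2 specifies the number of observations (n), the number of predictor variables (mx), the number of response variables (my), the scaling method for the predictors (iscale), and the number of latent variables (maxfac). "Nag_PredStdScale" means that each variable is scaled by its standard deviation.
Line 3 specifies which predictor variables are included in the model (isx).
Lines 4 to 33 give the observed values of the predictor variables (x) and of the response variable (y). Each observation occupies two lines: the 15 predictor values followed by the single response value.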
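
The layout of lines 2 and 3 can be illustrated with a small standalone sketch. It does not use the nAG library: the names pls_header, parse_pls_header and count_included are hypothetical, introduced only for illustration, and the scaling option is kept as a plain string rather than converted to an enum value.

```c
#include <stdio.h>

/* Hypothetical container for the values on line 2 of the data file. */
typedef struct {
  long n;          /* number of observations */
  long mx;         /* number of predictor variables */
  long my;         /* number of response variables */
  char iscale[40]; /* scaling option name, e.g. "Nag_PredStdScale" */
  long maxfac;     /* number of latent variables */
} pls_header;

/* Parse "n mx my iscale maxfac"; text after the fifth field is ignored.
 * Returns 0 on success, -1 if fewer than five fields could be read. */
int parse_pls_header(const char *line, pls_header *h)
{
  int got = sscanf(line, "%ld %ld %ld %39s %ld",
                   &h->n, &h->mx, &h->my, h->iscale, &h->maxfac);
  return got == 5 ? 0 : -1;
}

/* Count the predictors flagged with 1 on the isx line (the program
 * listing calls this count ip). Reads exactly mx flags and returns -1
 * if the line holds fewer than mx integers. */
long count_included(const char *line, long mx)
{
  long ip = 0, j, flag;
  int used;

  for (j = 0; j < mx; j++) {
    if (sscanf(line, "%ld%n", &flag, &used) != 1)
      return -1;
    line += used; /* advance past the integer just read */
    if (flag == 1)
      ip++;
  }
  return ip;
}
```

For the data file above, all 15 flags on line 3 are 1, so every predictor is included in the model and the count equals mx.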

Output

nag_pls_orth_scores_svd (g02lac) Example Program Results
 x-loadings, P
             1          2          3          4
  1     -0.6708    -1.0047     0.6505     0.6169
  2      0.4943     0.1355    -0.9010    -0.2388
  3     -0.4167    -1.9983    -0.5538     0.8474
  4      0.3930     1.2441    -0.6967    -0.4336
  5      0.3267     0.5838    -1.4088    -0.6323
  6      0.0145     0.9607     1.6594     0.5361
  7     -2.4471     0.3532    -1.1321    -1.3554
  8      3.5198     0.6005     0.2191     0.0380
  9      1.0973     2.0635    -0.4074    -0.3522
 10     -2.4466     2.5640    -0.4806     0.3819
 11      2.2732    -1.3110    -0.7686    -1.8959
 12     -1.7987     2.4088    -0.9475    -0.4727
 13      0.3629     0.2241    -2.6332     2.3739
 14      0.3629     0.2241    -2.6332     2.3739
 15     -0.3629    -0.2241     2.6332    -2.3739
 x-scores, T
          1       2       3       4
  1  -0.1896  0.3898 -0.2502 -0.2479
  2   0.0201 -0.0013 -0.1726 -0.2042
  3  -0.1889  0.3141 -0.1727 -0.1350
  4   0.0210 -0.0773 -0.0950 -0.0912
  5  -0.0090 -0.2649 -0.4195 -0.1327
  6   0.5479  0.2843  0.1914  0.2727
  7  -0.0937 -0.0579  0.6799 -0.6129
  8   0.2500  0.2033 -0.1046 -0.1014
  9  -0.1005 -0.2992  0.2131  0.1223
 10  -0.1810 -0.4427  0.0559  0.2114
 11   0.0497 -0.0762 -0.1526 -0.0771
 12   0.0173 -0.2517 -0.2104  0.1044
 13  -0.6002  0.3596  0.1876  0.4812
 14   0.3796  0.1338  0.1410  0.1999
 15   0.0773 -0.2139  0.1085  0.2106
 y-loadings, C
            1          2          3          4
 1      3.5425     1.0475     0.2548     0.1866
 y-scores, U
             1          2          3          4
  1     -1.7670     0.1812    -0.0600    -0.0320
  2     -0.6724    -0.2735    -0.0662    -0.0402
  3     -0.9852     0.4097     0.0158     0.0198
  4      0.2267    -0.0107     0.0180     0.0177
  5     -1.3370    -0.3619    -0.0173     0.0073
  6      8.9056     0.6000     0.0701     0.0422
  7     -1.0634     0.0332     0.0235    -0.0151
  8      4.2143     0.3184     0.0232     0.0219
  9     -2.1580    -0.2652     0.0153     0.0011
 10     -3.7999    -0.4520     0.0082     0.0034
 11     -0.2033    -0.2446    -0.0392    -0.0214
 12     -0.5942    -0.2398     0.0089     0.0165
 13     -5.6764     0.5487     0.0375     0.0185
 14      4.3707    -0.1161    -0.0639    -0.0535
 15      0.5395    -0.1274     0.0261     0.0139

Explained Variance
Model effects   Dependent variable(s)
   16.902124   89.638060 
   29.674338   97.476270 
   44.332404   97.939839 
   56.172041   98.188474

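Lines 2 to 18 show the x-loadings.
Lines 19 to 35 show the x-scores.
Lines 36 to 38 show the y-loadings.
Lines 39 to 55 show the y-scores.
Lines 57 to 62 show the explained variance for the model effects and for the dependent variable. Each row gives the cumulative percentage of variance explained in the predictor variables and in the response variable for that number of factors.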
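
Because both columns of the Explained Variance table are cumulative, the contribution of an individual factor is the difference between consecutive rows. The helper below is a minimal sketch of that arithmetic. The name cumulative_to_increment is hypothetical and not part of the nAG library.

```c
#include <stddef.h>

/* Convert cumulative percentages (one entry per number of factors) into
 * the share added by each successive factor. Both arrays hold k entries;
 * the first increment is simply the first cumulative value. */
void cumulative_to_increment(const double *cum, double *inc, size_t k)
{
  size_t i;

  for (i = 0; i < k; i++)
    inc[i] = (i == 0) ? cum[0] : cum[i] - cum[i - 1];
}
```

Applied to the dependent-variable column above, this gives roughly 89.64, 7.84, 0.46 and 0.25 percent for factors 1 to 4, so the first factor accounts for most of the explained variance in the response.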

Source Code

Note: This sample source code calls functions from the nAG Numerical Library (available for Windows, Linux, Mac, and other platforms).

/* nag_pls_orth_scores_svd (g02lac) Example Program.
 *
 * CLL6I261D/CLL6I261DL Version.
 *
 * Copyright 2017 Numerical Algorithms Group.
 *
 * Mark 26.1, 2017.
 */
/* Pre-processor includes */
#include <stdio.h>
#include <math.h>
#include <nag.h>
#include <nag_stdlib.h>
#include <nagg02.h>
#include <nagx04.h>

int main(void)
{
  /*Integer scalar and array declarations */
  Integer exit_status = 0;
  Integer i, ip, j, maxfac, mx, my, n;
  Integer pdc, pdp, pdt, pdu, pdw, pdx, pdxres, pdy, pdycv, pdyres;
  Integer *isx = 0;
  /*Double scalar and array declarations */
  double *c = 0, *p = 0, *t = 0, *u = 0, *w = 0, *x = 0, *xbar = 0;
  double *xcv = 0, *xres = 0, *xstd = 0, *y = 0, *ybar = 0;
  double *ycv = 0, *yres = 0, *ystd = 0;
  /*Character scalar and array declarations */
  char sscale[40];
  /*nAG Types */
  Nag_OrderType order;
  Nag_ScalePredictor scale;
  NagError fail;

  INIT_FAIL(fail);

  printf("nag_pls_orth_scores_svd (g02lac) Example Program Results\n");
  /* Skip header in data file. */
  scanf("%*[^\n] ");
  /* Read data values. */
  scanf("%ld%ld%ld%39s %ld%*[^\n] ",
        &n, &mx, &my, sscale, &maxfac);
  scale = (Nag_ScalePredictor) nag_enum_name_to_value(sscale);

  if (!(isx = NAG_ALLOC(mx, Integer)))
  {
    printf("Allocation failure\n");
    exit_status = -1;
    goto END;
  }
  for (j = 0; j < mx; j++)
    scanf("%ld ", &isx[j]);
  scanf("%*[^\n] ");
  ip = 0;
  for (j = 0; j < mx; j++) {
    if (isx[j] == 1)
      ip = ip + 1;
  }
#ifdef NAG_COLUMN_MAJOR
  pdc = my;
  pdp = ip;
  pdt = n;
  pdu = n;
  pdw = ip;
  pdx = n;
#define X(I, J)    x[(J-1)*pdx + I-1]
  pdxres = n;
  pdy = n;
#define Y(I, J)    y[(J-1)*pdy + I-1]
  pdycv = maxfac;
#define YCV(I, J)  ycv[(J-1)*pdycv + I-1]
  pdyres = n;
  order = Nag_ColMajor;
#else
  pdc = maxfac;
  pdp = maxfac;
  pdt = maxfac;
  pdu = maxfac;
  pdw = maxfac;
  pdx = mx;
#define X(I, J)    x[(I-1)*pdx + J-1]
  pdxres = ip;
  pdy = my;
#define Y(I, J)    y[(I-1)*pdy + J-1]
  pdycv = my;
#define YCV(I, J)  ycv[(I-1)*pdycv + J-1]
  pdyres = my;
  order = Nag_RowMajor;
#endif
  /* Allocate the input and output arrays */
  if (!(c = NAG_ALLOC(pdc * (order == Nag_RowMajor ? my : maxfac), double)) ||
      !(p = NAG_ALLOC(pdp * (order == Nag_RowMajor ? ip : maxfac), double)) ||
      !(t = NAG_ALLOC(pdt * (order == Nag_RowMajor ? n : maxfac), double)) ||
      !(u = NAG_ALLOC(pdu * (order == Nag_RowMajor ? n : maxfac), double)) ||
      !(w = NAG_ALLOC(pdw * (order == Nag_RowMajor ? ip : maxfac), double)) ||
      !(x = NAG_ALLOC(pdx * (order == Nag_RowMajor ? n : mx), double)) ||
      !(xbar = NAG_ALLOC(ip, double)) ||
      !(xcv = NAG_ALLOC(maxfac, double)) ||
      !(xres = NAG_ALLOC(pdxres * (order == Nag_RowMajor ? n : ip), double))
      || !(xstd = NAG_ALLOC(ip, double))
      || !(y = NAG_ALLOC(pdy * (order == Nag_RowMajor ? n : my), double))
      || !(ybar = NAG_ALLOC(my, double))
      || !(ycv =
           NAG_ALLOC(pdycv * (order == Nag_RowMajor ? maxfac : my), double))
      || !(yres =
           NAG_ALLOC(pdyres * (order == Nag_RowMajor ? n : my), double))
      || !(ystd = NAG_ALLOC(my, double)))
  {
    printf("Allocation failure\n");
    exit_status = -1;
    goto END;
  }
  /* Read data values. */
  for (i = 1; i <= n; i++) {
    for (j = 1; j <= mx; j++)
      scanf("%lf ", &X(i, j));
    for (j = 1; j <= my; j++)
      scanf("%lf ", &Y(i, j));
  }
  scanf("%*[^\n] ");
  /* Fit a PLS model. */
  /*
   * nag_pls_orth_scores_svd (g02lac)
   * Partial least squares
   */
  nag_pls_orth_scores_svd(order, n, mx, x, pdx, isx, ip, my, y, pdy, xbar,
                          ybar, scale, xstd, ystd, maxfac, xres, pdxres,
                          yres, pdyres, w, pdw, p, pdp, t, pdt, c, pdc, u,
                          pdu, xcv, ycv, pdycv, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_pls_orth_scores_svd (g02lac).\n%s\n",
           fail.message);
    exit_status = 1;
    goto END;
  }
  /*
   * nag_gen_real_mat_print (x04cac)
   * Print real general matrix (easy-to-use)
   */
  fflush(stdout);
  nag_gen_real_mat_print(order, Nag_GeneralMatrix, Nag_NonUnitDiag, ip,
                         maxfac, p, pdp, "x-loadings, P", 0, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_gen_real_mat_print (x04cac).\n%s\n", fail.message);
    exit_status = 1;
    goto END;
  }
  /*
   * nag_gen_real_mat_print (x04cac)
   * Print real general matrix (easy-to-use)
   */
  fflush(stdout);
  nag_gen_real_mat_print(order, Nag_GeneralMatrix, Nag_NonUnitDiag, n,
                         maxfac, t, pdt, "x-scores, T", 0, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_gen_real_mat_print (x04cac).\n%s\n", fail.message);
    exit_status = 1;
    goto END;
  }
  /*
   * nag_gen_real_mat_print (x04cac)
   * Print real general matrix (easy-to-use)
   */
  fflush(stdout);
  nag_gen_real_mat_print(order, Nag_GeneralMatrix, Nag_NonUnitDiag, my,
                         maxfac, c, pdc, "y-loadings, C", 0, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_gen_real_mat_print (x04cac).\n%s\n", fail.message);
    exit_status = 1;
    goto END;
  }
  /*
   * nag_gen_real_mat_print (x04cac)
   * Print real general matrix (easy-to-use)
   */
  fflush(stdout);
  nag_gen_real_mat_print(order, Nag_GeneralMatrix, Nag_NonUnitDiag, n,
                         maxfac, u, pdu, "y-scores, U", 0, &fail);
  if (fail.code != NE_NOERROR) {
    printf("Error from nag_gen_real_mat_print (x04cac).\n%s\n", fail.message);
    exit_status = 1;
    goto END;
  }
  printf("\n");
  printf("%s\n", "Explained Variance");
  printf("%12s%24s\n", "Model effects", "Dependent variable(s)");
  for (i = 1; i <= maxfac; i++) {
    printf("%12.6f", xcv[i - 1]);
    for (j = 1; j <= my; j++)
      printf("%12.6f%s", YCV(i, j), j % 10 ? " " : "\n");
    printf("\n");
  }

END:
  NAG_FREE(c);
  NAG_FREE(p);
  NAG_FREE(t);
  NAG_FREE(u);
  NAG_FREE(w);
  NAG_FREE(x);
  NAG_FREE(xbar);
  NAG_FREE(xcv);
  NAG_FREE(xres);
  NAG_FREE(xstd);
  NAG_FREE(y);
  NAG_FREE(ybar);
  NAG_FREE(ycv);
  NAG_FREE(yres);
  NAG_FREE(ystd);
  NAG_FREE(isx);

  return exit_status;
}
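
The X(I, J), Y(I, J) and YCV(I, J) macros in the listing translate 1-based (row, column) indices into offsets in a flat array, and the leading dimensions (pdx, pdy, pdycv) change with the storage order. The following sketch restates that mapping as two plain functions. The names offset_row_major and offset_col_major are hypothetical and are used only here.

```c
#include <stddef.h>

/* Offset of 1-based element (i, j) when rows are stored contiguously.
 * pd is the stride between rows (at least the number of columns). */
size_t offset_row_major(size_t i, size_t j, size_t pd)
{
  return (i - 1) * pd + (j - 1);
}

/* Offset of 1-based element (i, j) when columns are stored contiguously.
 * pd is the stride between columns (at least the number of rows). */
size_t offset_col_major(size_t i, size_t j, size_t pd)
{
  return (j - 1) * pd + (i - 1);
}
```

For a 3-by-2 matrix, element (2, 1) sits at offset 2 in row-major storage (pd = 2) and at offset 1 in column-major storage (pd = 3). This is why the listing sets pdx to mx in the row-major branch and to n in the column-major branch.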
