Rkg Presentation

  • 8/4/2019 Rkg Presentation

    1/25

    MICRO-LEVEL FORECASTS FOR INDIA'S EXPORT SECTOR

    SPECIFIC COUNTRIES AND SPECIFIC COMMODITIES

    Analytics & Modelling Division

    NATIONAL INFORMATICS CENTRE

    Department of Information Technology

    Ministry of Communication & Information Technology

    New Delhi-110003

    Major input to India's export model for a financial year

    Input to an econometric model to derive macro-level forecasts for strategic planning for India's exports (RIS Study)

    NIC has developed micro-level forecasts for a financial year for specific countries and specific commodities (Total variables: 319)

    Tools and technologies used:

    Monthly time series behavior is captured through neural network methodology.

    The final model selected has been simulated within and outside the sample; once it has stabilized with regard to error statistics, forecasts are generated.

    4Thought/Freefore is the state-of-the-art software tool from COGNOS that has been used to simulate and generate micro-level forecasts of India's exports for a financial year.

    The reliability of the forecasts and the degree of confidence are part of the final model.

    Table A: SUMMARY OF COUNTRY-WISE DATA SETS

    (Time series forecasting carried out for the listed number of data sets)

    Country List    Com-Codes  Var. for each Code  Total Vbls               Exports                Imports                UVI                    ROW
    Canada          13         4                   52+2(rest)+1(all) = 55   Apr 1996 to June 2003  Jan 1995 to Nov 2003   Jan 1995 to Nov 2003   Jan 1995 to Nov 2003
    USA             17         4                   68                       Apr 1996 to May 2003   Jan 1993 to Oct 2003   Jan 1993 to Oct 2003   Jan 1993 to Oct 2003
    China           10         4                   40                       Apr 1996 to May 2003   Jan 1995 to Nov 2003   Jan 1995 to Nov 2003   Jan 1995 to Nov 2003
    Japan           11         4                   44                       Apr 1996 to June 2003  Jan 1994 to Nov 2003   Jan 1994 to Nov 2003   Jan 1994 to Nov 2003
    Malaysia*       1          1                   1                        Apr 1996 to Aug 2002   NA                     NA                     NA
    Singapore*      1          1                   1                        Apr 1996 to Aug 2002   NA                     NA                     NA
    Thailand*       1          1                   1                        Apr 1996 to Aug 2002   NA                     NA                     NA
    Hong Kong*      1          1                   1                        Apr 1996 to Aug 2002   NA                     NA                     NA
    Rest of World*  1          1                   1                        Apr 1996 to Aug 2002   NA                     NA                     NA
    European Union  26         4                   104+2(rest)+1(all) = 107 Apr 1996 to June 2003  Jan 1996 to June 2003  Jan 1996 to June 2003  Jan 1996 to June 2003
    TOTAL                                          319
    No. of Obs (Range)                                                      77-92                  90-130                 90-130                 90-130
    Period Range                                                            Apr 1996 to June 2003  Jan 1993 to Nov 2003   Jan 1993 to Nov 2003   Jan 1993 to Nov 2003

    * Only a single variable, total export of all commodities from India, is considered.

    Includes both series, monthly as well as annual, with 26 items in each series.

    Univariate ARIMA MODEL

    1. In regression analysis, if the error terms are not independent, i.e. they are autocorrelated, the efficiency of the ordinary least-squares (OLS) parameter estimates is adversely affected and the standard error estimates are biased.

    2. The Auto Regressive Integrated Moving Average (ARIMA) model is suited to data with autocorrelated errors, which occur frequently in time series data.

    3. The ARIMA procedure analyzes and forecasts equally spaced univariate time series data, transfer function data, and intervention data using the autoregressive moving-average (ARMA) model or the more general autoregressive integrated moving-average (ARIMA) model.

    4. An ARIMA model predicts a value in a response time series as a linear combination of its own past values, past errors, and current and past values of other time series.
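
    The "linear combination of past values and past errors" can be illustrated with a minimal sketch. It covers only the univariate ARMA part (no differencing and no other input series), and the function name, the coefficient values, and the sign convention for the moving-average terms are assumptions made for illustration, not taken from the tools used in this study.

```python
def arma_forecast(series, phi, theta, mean=0.0):
    """One-step-ahead ARMA forecast.

    Assumed convention:
      y[t] = mean + sum_i phi[i]   * (y[t-1-i] - mean)
                  + sum_j theta[j] * e[t-1-j] + e[t]
    Returns the forecast for the next period and the in-sample errors.
    """
    errors = []

    def predict(t):
        # Autoregressive part: weighted past values (as deviations from the mean).
        ar = sum(c * (series[t - 1 - i] - mean)
                 for i, c in enumerate(phi) if t - 1 - i >= 0)
        # Moving-average part: weighted past one-step forecast errors.
        ma = sum(c * errors[t - 1 - j]
                 for j, c in enumerate(theta) if t - 1 - j >= 0)
        return mean + ar + ma

    # Walk through the history to build up the past errors.
    for t in range(len(series)):
        errors.append(series[t] - predict(t))

    return predict(len(series)), errors


# Hypothetical two-point series with illustrative coefficients.
next_value, past_errors = arma_forecast([1.0, 2.0], phi=[0.5], theta=[0.5])
```

    In practice the coefficients are estimated from the data rather than supplied by hand.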

    Univariate ARIMA MODEL (Contd.)

    An ARIMA model contains three different kinds of parameters:

    the p AR parameters;

    the q MA parameters;

    and the variance of the error term.

    This amounts to a total of p + q + 1 parameters to be estimated.

    These parameters are always estimated using a stationary time series (a time series which is stationary with respect to its variance and mean).
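
    A common way to obtain a series that is stationary in the mean is differencing, which is the "integrated" part of ARIMA. The sketch below assumes that approach; the slides do not state how stationarity was achieved in this study, and stabilizing the variance (for example with a log transform) is not shown. The function names are hypothetical.

```python
def difference(series, d=1):
    """Apply first differencing d times: y[t] - y[t-1]."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def count_parameters(p, q):
    """p AR parameters + q MA parameters + 1 error variance."""
    return p + q + 1


# A linear trend becomes constant after one difference,
# a quadratic trend after two.
linear_diff = difference([3, 5, 7, 9])            # [2, 2, 2]
quadratic_diff = difference([1, 4, 9, 16], d=2)   # [2, 2]
```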

    NEURAL NETWORK

    Neural networks cannot do anything that cannot be done using traditional computing techniques, BUT they can do some things which would otherwise be very difficult (time consuming).

    Neural networks form a model from training data (or possibly input data) alone.

    This is particularly useful when time series behavior is complex and the forecast for one period is an input to the forecast for the next period.

    When a time series is complex, follows an unknown pattern, and involves a large number of variables, a neural network learns from past behavior to develop a correspondingly complex algorithm and then predicts. (ARIMA: Univariate, Multivariate)
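
    Feeding each forecast back in as an input for the next period is often called recursive (iterated) forecasting. The sketch below shows only that loop. The `model` argument stands in for a trained network, and `mean_of_window` is a hypothetical placeholder so that the example runs without any neural network library.

```python
def recursive_forecast(history, model, lags, steps):
    """Forecast `steps` periods ahead, reusing each forecast as an input."""
    window = list(history[-lags:])       # most recent `lags` observations
    forecasts = []
    for _ in range(steps):
        next_value = model(window)       # predict one period ahead
        forecasts.append(next_value)
        window = window[1:] + [next_value]   # slide the window forward
    return forecasts


def mean_of_window(window):
    """Hypothetical stand-in for a trained model: average of its inputs."""
    return sum(window) / len(window)


two_ahead = recursive_forecast([1.0, 2.0, 3.0], mean_of_window, lags=2, steps=2)
```

    Because later forecasts are built on earlier ones, errors can compound as the horizon grows.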

    NEURAL NETWORK

    Neural networks are a form of multiprocessor computer system, with

    - simple processing elements
    - a high degree of interconnection
    - simple scalar messages
    - adaptive interaction between elements

    A biological neuron may have as many as 10,000 different inputs, and may

    send its output (the presence or absence of a short-duration spike) to many

    other neurons.

    Neurons are wired up in a 3-dimensional pattern.

    Example

    A simple single unit adaptive network:

    The network has 2 inputs, I0 and I1, and one output. All are binary.

    If W0 * I0 + W1 * I1 + Wb > 0, then Output is 1

    If W0 * I0 + W1 * I1 + Wb <= 0, then Output is 0
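
    The threshold rule for this single unit can be sketched as follows. Since the output is binary, the sketch returns 0 whenever the weighted sum is not positive. The weight values are hypothetical, chosen so that the unit behaves as a logical AND; they are not taken from the slides.

```python
def unit_output(i0, i1, w0, w1, wb):
    """Single threshold unit: 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if w0 * i0 + w1 * i1 + wb > 0 else 0


# Hypothetical weights: only (1, 1) pushes the sum above zero.
AND_WEIGHTS = dict(w0=1.0, w1=1.0, wb=-1.5)

truth_table = {
    (a, b): unit_output(a, b, **AND_WEIGHTS)
    for a in (0, 1)
    for b in (0, 1)
}
```

    An adaptive network adjusts W0, W1 and Wb from examples instead of having them fixed by hand.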

    Feed Forward Neural Network

    1. EU

    A. 30613 (Import of Shrimps and prawns, frozen)

    Model Statistics

    Model fit: 75.5004
    Test fit: 78.4198
    Overall fit: 76.4137
    Adjusted fit: 65.3762
    Iterations: 69
    RMS error: 16.0265
    Standard deviation: 16.1163
    95% confidence interval: 32.2326
    Mean absolute error: 12.5406
    Mean absolute error (%): 8.7764
    F-Statistic: 20.7884
    Durbin-Watson Statistic: 1.0007
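
    For readers who want to see how error statistics of this kind are usually computed, here is a minimal sketch using the standard textbook definitions. Whether the software computes them in exactly this way is an assumption, and the sample data are hypothetical.

```python
import math


def error_statistics(actual, predicted):
    """RMS error, mean absolute error (absolute and %), and Durbin-Watson."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    return {
        "rms_error": math.sqrt(sum_sq / n),
        "mean_absolute_error": sum(abs(r) for r in residuals) / n,
        # Percentage error is relative to each actual value (must be non-zero).
        "mean_absolute_error_pct":
            100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n,
        # Durbin-Watson: squared successive differences over squared residuals.
        "durbin_watson":
            sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n)) / sum_sq,
    }


# Hypothetical values, for illustration only.
stats = error_statistics([10, 20, 30, 40], [12, 18, 33, 39])
```

    A Durbin-Watson value near 2 indicates little lag-1 autocorrelation in the residuals, while values well below 2 point to positive autocorrelation. In the figures reported in these slides, the 95% confidence interval equals twice the standard deviation (for example, 2 x 16.1163 = 32.2326).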

    STATISTICAL MEASURES

    Model fit

    A measure of how well the model fits the original data used in modeling. 100% represents a perfect fit. The model fit would approach 0% if you guessed the average value for the target. If the value is negative, the fit is worse than if you had guessed the average value for the target (that is, you had a naive model). The model fit is based on an adaptation of the standard R^2 statistic (that is, the proportion of the relationship explained between two variables).

    Adjusted fit

    The overall fit adjusted for the number of factors and the number of rows of data contained in the model. This assumes that a more complex model or less data will produce a less predictive model.
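
    The behavior described above (100% for a perfect fit, 0% for guessing the average, negative when worse than the average) matches the standard R^2 statistic expressed as a percentage. The sketch below uses the standard R^2 and adjusted R^2 formulas; the tool's own "adaptation" is not documented in these slides, so treating it as identical to these formulas is an assumption.

```python
def fit_percent(actual, predicted):
    """Standard R^2 as a percentage: 100 * (1 - SSE / SST)."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # model error
    sst = sum((a - mean) ** 2 for a in actual)                  # naive-model error
    return 100.0 * (1.0 - sse / sst)


def adjusted_fit_percent(fit, n_rows, n_factors):
    """Standard adjusted R^2: penalizes more factors and fewer rows."""
    r2 = fit / 100.0
    return 100.0 * (1.0 - (1.0 - r2) * (n_rows - 1) / (n_rows - n_factors - 1))


actual = [1, 2, 3, 4, 5]
perfect = fit_percent(actual, actual)                 # perfect fit
naive = fit_percent(actual, [3, 3, 3, 3, 3])          # guessing the average
worse = fit_percent(actual, [5, 4, 3, 2, 1])          # worse than the average
```

    With the same raw fit, adding factors or removing rows lowers the adjusted value, which is the penalty the definition describes.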

    Test fit

    The percentage of variation in the test set explained by the model. Test fit (or percent test fit) is a measure of how well the model predicts the test data, and is the best measure of the genuine predictive performance of the model. The test fit is an adaptation of the standard R^2 statistic. Unlike the model fit, the test fit can be negative. This happens if the current model yields a less accurate prediction of the test set than the naive model.

    Overall fit

    An indicator of model quality that combines the model fit and the test fit. The overall fit is the percentage of the variation explained in the dependent variable.
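
    A test fit is measured on data held out from model building. The sketch below assumes a chronological split (the last observations form the test set) and takes the naive model to be the average of the training data; both choices are assumptions, since the slides do not say how the tool builds its test set or its naive benchmark. The function names and data are hypothetical.

```python
def chronological_split(series, test_size):
    """Keep the most recent `test_size` observations as the test set."""
    return series[:-test_size], series[-test_size:]


def holdout_fit_percent(test_actual, test_predicted, naive_value):
    """Percentage improvement of the model over a constant naive forecast."""
    sse_model = sum((a - p) ** 2 for a, p in zip(test_actual, test_predicted))
    sse_naive = sum((a - naive_value) ** 2 for a in test_actual)
    return 100.0 * (1.0 - sse_model / sse_naive)


series = [10, 12, 11, 13, 12, 14]          # hypothetical monthly values
train, test = chronological_split(series, test_size=2)
naive_value = sum(train) / len(train)      # average of the training data

good_fit = holdout_fit_percent(test, [12.5, 13.5], naive_value)  # beats naive
bad_fit = holdout_fit_percent(test, [8.0, 20.0], naive_value)    # negative
```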

    B. 90111 (Export of Coffee, neither roasted nor decaffeinated)

    Model Statistics

    Model fit: 75.6046
    Test fit: 73.7038
    Overall fit: 75.2571
    Adjusted fit: 64.0117
    Iterations: 54
    RMS error: 4.4336
    Standard deviation: 4.4593
    95% confidence interval: 8.9186
    Mean absolute error: 3.1465
    Mean absolute error (%): 34.767
    F-Statistic: 18.7563
    Durbin-Watson Statistic: 0.5446

    C. 251611 (Import of Granite, crude/rough)

    Model Statistics

    Model fit: 67.3539
    Test fit: 61.8533
    Overall fit: 66.0773
    Adjusted fit: 56.5328
    Iterations: 66
    RMS error: 3.4094
    Standard deviation: 3.4285
    95% confidence interval: 6.857
    Mean absolute error: 2.7858
    Mean absolute error (%): 6.6183
    F-Statistic: 12.4989
    Durbin-Watson Statistic: 2.122

    2. CHINA

    A. 670300 (Import of Human Hair, dressed, thinned, bleached or otherwise worked; wool or other animal hair or other textile materials, prepared for use in making wigs or the like)

    Model Statistics

    Model fit: 85.0775
    Test fit: 84.3229
    Overall fit: 84.9804
    Adjusted fit: 74.6557
    Iterations: 30
    RMS error: 1.0522
    Standard deviation: 1.0571
    95% confidence interval: 2.1143
    Mean absolute error: 0.7224
    Mean absolute error (%): 24.07
    F-Statistic: 44.3208
    Durbin-Watson Statistic: 1.2491

    B. CHINA (Import of rest of the codes)

    Model Statistics

    Model fit: 87.8544
    Test fit: 82.4129
    Overall fit: 87.1099
    Adjusted fit: 76.5264
    Iterations: 126
    RMS error: 2828.6593
    Standard deviation: 2841.9707
    95% confidence interval: 5683.9414
    Mean absolute error: 2114.0386
    Mean absolute error (%): 12.5192
    F-Statistic: 52.9366
    Durbin-Watson Statistic: 0.8763

    C. CHINA (Unit value index for rest of the codes)

    Model Statistics

    Model fit: 61.607
    Test fit: 76.4597
    Overall fit: 66.02
    Adjusted fit: 57.6874
    Iterations: 46
    RMS error: 6.1855
    Standard deviation: 6.2157
    95% confidence interval: 12.4314
    Mean absolute error: 4.2899
    Mean absolute error (%): 4.5121
    F-Statistic: 14.5718
    Durbin-Watson Statistic: 0.9655

    3. USA

    A. 420310 (Import of Articles of apparel)

    MODEL STATISTICS IN TERMS OF THE ORIGINAL DATA

    Number of Residuals (R)         = n                        70
    Number of Degrees of Freedom    = n-m                      62
    Residual Mean                   = Sum R / n                .683103E-02
    Sum of Squares                  = Sum R**2                 121.321
    Variance                        var = SOS/(n)              1.73316
    Adjusted Variance               = SOS/(n-m)                1.95679
    Standard Deviation              = SQRT(Adj Var)            1.39885
    Standard Error of the Mean      = Standard Dev/SQRT(n-m)   .177655
    Mean / its Standard Error       = Mean/SEM                 .384512E-01
    Mean Absolute Deviation         = Sum(ABS(R))/n            .992518
    AIC Value ( Uses var )          = n*ln(var) + 2m           54.4962
    SBC Value ( Uses var )          = n*ln(var) + m*ln(n)      72.4841
    BIC Value ( Uses var )          = see Wei p153             -95.0882
    R Square                        =                          .887551
    Durbin-Watson Statistic         = [A-A(T-1)]**2/A**2       1.95492

    D-W STATISTIC SUGGESTS NO SIGNIFICANT AUTOCORRELATION for lag1.
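
    Several formulas in this listing lost characters in extraction (for example "nln +2m" and "Standard Dev/"). The sketch below assumes the forms n*ln(var) + 2m for AIC, n*ln(var) + m*ln(n) for SBC, and Standard Dev / SQRT(n-m) for the standard error of the mean. These forms are inferred because they reproduce the reported values for code 420310, with m = 8 taken from n minus the degrees of freedom (70 - 62); they are not confirmed by the software's documentation. The function name is hypothetical.

```python
import math


def residual_summary(n, m, sum_of_squares):
    """Recompute the summary statistics from n, m and the residual sum of squares."""
    var = sum_of_squares / n               # Variance = SOS / n
    adj_var = sum_of_squares / (n - m)     # Adjusted Variance = SOS / (n - m)
    std_dev = math.sqrt(adj_var)           # Standard Deviation = SQRT(Adj Var)
    return {
        "variance": var,
        "adjusted_variance": adj_var,
        "standard_deviation": std_dev,
        "standard_error_of_mean": std_dev / math.sqrt(n - m),  # inferred form
        "aic": n * math.log(var) + 2 * m,                      # inferred form
        "sbc": n * math.log(var) + m * math.log(n),            # inferred form
    }


# Inputs as reported for code 420310.
summary = residual_summary(n=70, m=8, sum_of_squares=121.321)
```

    The recomputed values agree with the listing to the precision shown there (AIC about 54.50, SBC about 72.48).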

    B. 570110 (Import of Carpets and other textile coverings of wool or fine animal hair)

    MODEL STATISTICS IN TERMS OF THE ORIGINAL DATA

    Number of Residuals (R)         = n                        103
    Number of Degrees of Freedom    = n-m                      97
    Residual Mean                   = Sum R / n                -.783408E-14
    Sum of Squares                  = Sum R**2                 1578.37
    Variance                        var = SOS/(n)              15.3239
    Adjusted Variance               = SOS/(n-m)                16.2718
    Standard Deviation              = SQRT(Adj Var)            4.03383
    Standard Error of the Mean      = Standard Dev/SQRT(n-m)   .409574
    Mean / its Standard Error       = Mean/SEM                 -.191274E-13
    Mean Absolute Deviation         = Sum(ABS(R))/n            3.10562
    AIC Value ( Uses var )          = n*ln(var) + 2m           293.130
    SBC Value ( Uses var )          = n*ln(var) + m*ln(n)      308.938
    BIC Value ( Uses var )          = see Wei p153             -26.2750
    R Square                        =                          .858561
    Durbin-Watson Statistic         = [A-A(T-1)]**2/A**2       1.88808

    D-W STATISTIC SUGGESTS NO SIGNIFICANT AUTOCORRELATION for lag1.

    C. 610510 (Import of Men's or boys' shirts of cotton, knitted or crocheted)

    MODEL STATISTICS IN TERMS OF THE ORIGINAL DATA

    Number of Residuals (R)         = n                        105
    Number of Degrees of Freedom    = n-m                      99
    Residual Mean                   = Sum R / n                -.708456E-01
    Sum of Squares                  = Sum R**2                 10575.5
    Variance                        var = SOS/(n)              100.719
    Adjusted Variance               = SOS/(n-m)                106.824
    Standard Deviation              = SQRT(Adj Var)            10.3355
    Standard Error of the Mean      = Standard Dev/SQRT(n-m)   1.03876
    Mean / its Standard Error       = Mean/SEM                 -.682020E-01
    Mean Absolute Deviation         = Sum(ABS(R))/n            7.73821
    AIC Value ( Uses var )          = n*ln(var) + 2m           496.295
    SBC Value ( Uses var )          = n*ln(var) + m*ln(n)      512.219
    BIC Value ( Uses var )          = see Wei p153             165.540
    R Square                        =                          .848765
    Durbin-Watson Statistic         = [A-A(T-1)]**2/A**2       2.04567

    D-W STATISTIC SUGGESTS NO SIGNIFICANT AUTOCORRELATION for lag1.

    Conclusion:

    Time Constraint:

    The number of independent variables for which forecasts are to be generated is approximately 319.

    As new time series data keep arriving, forecasts for all 319 independent variables must be generated from the latest monthly data within a period of approximately 2 weeks.

    Each variable forecast is an independent exercise.

    Existing software tools are not fully automated, and intervention by subject and tool specialists is a must.

    Traditional statistical/econometric modelling techniques and software tools are a major constraint in terms of automation.

    What is Required:

    NIC can develop a fully automated forecasting system by developing algorithms and testing them with state-of-the-art tools available with a limited interface.

    The state-of-the-art software tools and techniques will require funding: manpower and resource mobilization to the tune of Rs. 10 lakhs, for a period of 8 months.