AN ANALYSIS OF PROGRAM BEHAVIOR - 筑波大学...E工S-TR-76-3 * AN ANALYS工S O:F PROGRAM...
Transcript of AN ANALYSIS OF PROGRAM BEHAVIOR - 筑波大学...E工S-TR-76-3 * AN ANALYS工S O:F PROGRAM...
EIS-T R 一一 76-3
{:fiiii}’〉“’r Dj}/!lgfift..”nj.ig..:;.
義一
、蛙 竃圏発
AN ANALYSIS OF PROGRAM BEHAVIOR
by
Tomoo Nakamura
Hajime Kitagawa
Hiroshi Hagiwara
July 7,1976
E工S-TR-76-3
*
AN ANALYS工S O:F PROGRAM BEHAV工OR
**
Tomoo Nakamura
***
Hajime Kitagawa
****
Hiroshi Hagiwara
July 7, 1976
Abstrac’ヒ
As computer systems have increased their complexity rit has become very important ・ヒ。 analyze the systemefficiency. 工n ’ヒhe anaZysis of compu’ヒer systems r it isfairly of use to have the knowledge of dynamic behaviorof programs under execution. This paper is an empiricalstudy of program behavior. We prepared an address tracerwhich recorded the memory reference patterns of programsand got the traced data for severai programs. By usingthese da’ヒa we analyzed some kinds of proqram behavior:run length; branch dis’ヒance; memory demand, stackdistance frequency.for typical replacemen・ヒ algori’ヒhms(]li RU and OPT); instruction buffering. 工n this paper
we present the results of the measurement and analysis.
*This report is a revised edition of our paper appeared in Journalof 工nformation Processing Society of Japan,Vol.15,No.1(1974).★★工nstitute of Elec七ronics and 工nformation Science, University ofTsukuba, :Niihari-Gun, 工baraki 300-31, Japan★★★Data Processing Center, 1くyoto Universi・ヒy, Sakyo-ku r Kyoto 606,Japan★★★★Department of 工nformation Science, Kyoto University, Sakyo-ku,Kyo’ヒ0 606, Japan
工. 工ntroduction
As computex systems have increased,.their complexity,. it has
become very.important to evaluate. the .syStem performance or.
analyze the .system efficiency., .Tn the analysis of・ computer
systems,・・:the・re are.many cases in which・i七is Qf use・to・I have・・:ヒ}}e
knowledge of/ the dynamic・behavior of programs under-execution.
TypiCaユ examples・of the cases.are the・analysi.s・of.’paqed.:nemQry・’・ ;r
systems i!.’7’一?,一buffer. rnemor.y-systems”r7’8? jnterieaved ・・rnemory/ ・sy.stems,’6>
and segmentationed rnemory systems. ’ D・ C,. ... , L-
The information about program behavior can be obtained iby
measuring and analyzing ’ヒhe memory.reference・patterns gf・aC’しual
prograrns under,execu.tion. Then,. We prepared an addr’ess,tracer
which recorded・the memory yeference patterns・・ pE user一, pr.Qgxams.・r
and go・ヒ ・ヒhe ・ヒraced data for several programs・ By usi.ng.’ヒhese
da’ヒa.We analyzed. some・kinds of. program behavi◎r :.一run.・length.; branch
s>
distance;・memory (皐emand7・.stack.diS’ヒance frequenCy for ’ヒypiqa工
replacement・algorith璋s .(:LRU.and OP壬);.ins七ruction,.1一,uf’feri.n.g..
工n this paper we present the results of the meaSurement. and,
analysisi・...’;一,’”J一.k.一・/ ・’ .・ 一 一 ・ ・..,’一’・. .,’・ .,.・. ’.r・
[1]
,
工工. Address t士acing
To meatture the dynamic behavior of prograMs ahd to make the
g)
inpUt data £or memory reference simulation, we have prepared the
address tracer by which rnemory referencing patterns of user
programs during execution can be recorded. The address tracer runs
under the FACOM 230-60 batch operating systems. The tracer is
linkage-edited with the object programs being ’ヒraced, and only
user program Mode instructions of the object program are recorded
during exedution.
The unit of data which is to be sequentiaily recorded on
magnetic tape consis’ヒs of,
(1) address of the ins’ヒruction which references memory or causes
tr孕nsfer of control(ゴ㎜p or monitor-ca11),and its instruct-
ion code r
(2) address’of the referenced memory ( including aU the referenced
addresses in the case of indirect addressing ),
(3) type of the referencq; i“.e., load/store as an operand or
ins’ヒruction feヒch.
We selected several FORTRAN ,programs as the benchmark.s and
got. the traced data while they were being compiled and being
executed. ( See Table 1. )
The moderate slow-down ra’ヒe of execution time under tracing is
obtained;’ 50:1 tb 80:1. The address tracer is written in FASP
assembler language and its size is about 700 words.
[2]
Table 1
Hw一
trace da:ねcompile撃??ソ=
comp・@type
黎’汲奄o
f図04)貝・!。 ㌦ん 曳1。
C 38 22 2QO 130 670 a1Differentiai
@ equation
@ RKG E 9 47 171 14.8 681 a2
C
㍉)霞z●∈
c’一
カ22 38 83’・ 2t8 11.2 670 b1
E 14 79 310 1q5 58.5 b2
ElGENVALUE撃nx10 0symm
ルズiacobi
C *38 177 217 11.3 670 C1
ε 14 39 25β 1t6 .62.6 c2
Sorting C 潔38 38 2tO 12.8 662 d1
E・ 9 50 2Z3 18、8「 58.9 d2
BMSimu』≧tor E 16 5Q1 26.7 138 59.5 e2
f
PR
pw
PX
PF
and
: frequency o£ data reference by CPU r
: frequency of operand fetch / f,
: frequency of operand store. / fr
: frequency of ’奄獅唐狽窒浮モ狽奄盾氏@fetch / f,
= frequency of instruction or operand fe・ヒch / f,
P.R+Pw’+Px=1. PF=1-Pw・
ehe pmounY of core memory used*b.y ・the ’overlayed program
T士acing was 9・ヒoPPed when f = 500,000..
露
T11. bv!easurements of pro№窒≠香@behavlor
We made s’ヒatistical analysis by using the ’ヒraced da’ヒa and
characterized・the dynamic behavior of programs in the following
points of vieW.
3.1 Run length
A run length i’s defined by Sisson and FlynnG) as ”the numbe’r
of addres,s’@eS increasing’@i“ ’v,alue’ by one over’ the previous’ address
in an addressing pa’ttern”./
We’.defins S as the distribution bf run lengths such that:
s([m,n])is、t、lle pr・babi・.lity that.・sis betwee辞職nd. n←・ぐn・)’:
where s is’the’number of instructions which ’are executed
sequentiaZZy.
Some results of run lenqth dis・ヒribution are indicated in Fig.1.
FユX・1sh・ws that s’s「卑e白s・thrn‘ab・u七.5賦h≒’gh.pr・bab”:璽やr
all the measured programs.’
3.21 Branch distance
The bFang.h distan,ce (7istr.i;b. ution B iS definqd qs follos“s:・
B([’窒吹Cn]) i’s tt}e donditional probability that h is bctween’ 奄獅戟@and.n
(mくη.)岬heギrわis the branch、 dfstrワceダ‡・e・.”92一αエ’bepwee,n
abranch 閧浮ョn’・n(’address” aエ:)and the next’nstructi・n
(dddre16s g2)wゆbゆrhl’s executed・(わ≠0’エ)
TII.e lk?ican¢h distance!distributi’on as., w.ell.as the r.Up. length
distriPutittn may..indicqte., the charqcF,eri$t,lz’gs of prograMrstrUetqres
.suc吹@as the s.ize .and ogc}rence freqgency gifilooPs・
Some results of bicanch distanqe’distt’ibution are indicated
in Fig.2. Fig.2 shows that the probability・ Pop〈 2S・h〈10 ) is high
for all ’ヒ:he measured programs・
[4]
S([m,n])
Fig.1 Run length1.O
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0,1
a2 eai RXx ・
Z)2 RNX
N
N
d2
e2 O〉 x“
1べr\\
XN N xX > Ngs. / N bx))?〈× Xbg
;.>>・ x
黛
S([m,n]) = PM( msg.ssgn )
、卜
N[lr5]
B([m,n])
O.8
0.7
0.6
e・. s
O.4
0.3
0.2
0.1
[6,10] [ll,ユ,5] [16,◎el
Fig.2 Branch distance
AO
...
/QN矧;,;;,.,cSfi’SA i,
:
z
n‘
Al
只h2八1
!/ z、\
溢
[Mrn1
・A2 A3 A4 A5
N
N “tee. 一 一rr R“
t ’V NNL
A6 A7 A8 A9
A2 ;’ [一lo3,.10i j
・,.“ C? r・ [:・1,03/; 6io],
’A5;[ 2,ZO]『’`.67’1[・・10’102-1こ・…
A7;[ 102,lo3]「A8ラTユ03,io匂A9;1 ,iO4),ee 」.
B ([ln ., n])
=酬m≦わμ? /Pr(わメ。入b≠エ)
[M rn]
[5 ][
工nFig.2i the proportions of the number of branch ihstructions
executed to the total dynamic instruction steps. are as follows:
ao9 : 28.,8%,
h2 : 32.6%,
d2 : 14・2%r
e2 : 14.2%.
3.3 Merpory d,emand
We defin,e the amount of memory demand D as fo. Ilows:
D( T7m ) is the・average number of distinct blocks tha’ 煤@are requiired
by.M successive memory references du.ring execution of a program,
where the memory space is divided into bZocks each consisting of
m contiguous addresses.
D(?7m) corresponds to the average working set size2>of each’
programr and x corresponds to the window size.2) D(T;m) is
monotonically nondecreasing in y.
Fig.3 shows the resuits of the case where m = 8 and oo = 1 .一v
エ5000. 工n th6 case of α2, h2 0r e 2, 1)(:P;m) goes up sharp:Ly to ・ヒhe
璽saturation poin’ヒs冒. However, in the case of oエ (one of typical
pat’ヒerns in do叩pilatiQn phase ), D(r7M) does not become 響satura七ed i
while w・≦ エ5000. As far as the data shown in :Fig.3, ・ヒhe amounセ
of memory required by 15000 references are less than 2000words
for all the programs and especially less than 1000 words for h2,
e2 and d2.’
[61
D(T;m)
250(2000)
m= 8
Fig.3 Memory demand
200
150
(IOOO)
IOO
50
blocks(words)
o
oエ
/ / ノ1
戸
/
//
t/
戸
1.L 一e”/’ ,’ノ
tl’,1
易
‘z
/
一.o一一一e一一.・e一一= e’一
=一“一一一一e一一一e.一一一e一一一e一一一e一一一e,a 2
わ2
一.一一一._一.一_・一一一・一一一・一一一・e一一一e一一●一一●一一●欄一一●ソ
n卜
一
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 (xlOOO)
3.4 LRU stack distance frequency
The.LRU ( Zeast recently used ) stack di$tance frequency
5)defined by Maセtson et al. is a measure ’ヒha’ヒ indicateS program’s、
locality of memory reference’ apd can be used to ob’ヒain the access
rates to buffer” 高?f @bicYbr’main inbmory”in the simulation of buffer
memory systems or .paged memory systems.
We define R(g’;m) as the cumulative distribution of LRU stack・
dis’ヒance frequency such ’ヒha’ヒ1,
R(g’;m) is the cumulative distril?.ution of the. relative frequency
that the address of memoicy refe’icenced is contained in the g’ ”most
recently used” distinct bZocks if the memOry is divided into blocks
of m con・ヒiguous address6:s.
S・me results are sh・晦in Fig・4(a)(し4(f).エn Fig.4(a)and
4(b), the separate curve$.’ represent diff.erent block sizes.. On the
。the「hand’in Fig64@.ツ4(e)’the sepa「a七e cu「Yes「ep「esr弁t
different traced data. .:t is rernarkable’that the rapid ascent is
oPserved by increasing o’., from 1 to 2 in ,any case. This iS explained
by.the fact that mem・ryギeference string’s C・nsist。fセhb tw・
streams; L e.r the instrUction stream and /the,’operand ( data ).
S’ヒream.
曹 t ● Fig・4(c)’ 4(卓) or 4、(e) directly leφdS to ’ヒhe var:Lat■on ■n
the percentage og access,e. s to buffer meino¥.¥ ( block size mp.,) with
buffer memory caPacity m x g’ i’n the simuZation of buffer ’memQry
sys・ヒem壷hich is con・ヒrolled by fully associa’ti▽e mapPing and’LRU
replacement algorithm. From these figures, one can see tha’ 煤@the
access rates to bufEer memory made up of 2 blocks of 4 word-block
is abouti・50 %r and that the difference of the access rates among
[8]
R(g ;m)
1.O
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
Fig.4 IJRU stack.くa)
distance frequency
1024今x
512う・X
256」レx
l 2 8t”e
6 4み×
3 2‘sx
i 6RS
8のく
藍
x
a2
1 2 3 4 5 6 ’7 8 コ9 ・10・9
ヤ
R(の」,m)
1.O
O.9
0.8
0. 7
0. 6
0.5
0.4
0. 3
0.2
0.1
(b)
Il.lr,
1 2 3’x’S .5 6 7
[9T
8 9 zo
ゴ
R(g ;m)
i. o
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0e2
0.1
栩=4
(c)
d2NXsL一一一
e2
b2a2
d1
10S ’referenCeS
R(ゴ川)
1.O
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0e2
0.1
1
m=8
2 4 8 i6
(d)
d.2 L./;
xN/p
e2
ム/ぎ
e
b2
32 64 128 256
dl
a2
1di references
」
」
ユ 2 4 8 16
[ 10 ]
32 64 ユ28
R( 」,m)
1.O
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
m= 16
e2
(e)
d2ロー。窒P≡≡:o ZtS
苓
xli
A
多/●
わ2
a2
1 05.
dl
references
R( 97m)
1.O
O.9
0.8
0.7
0.6
0.5
0.4
0.3
0e2
0el
,
1 2・ 4 8
(f)
m= 8
e-e e/ b
e/ o/
//
/
メ
o
16
eo
X
32 64 128
でプefifS
ゴ
combined stream for α2コ
■nstruc七ion stream for α20pe「and stream fQ:r α2
comparison of the
a2
io5 references
ゴ、.
’ヒhree streams
ユ 2 4 8 16 32
[ ll ]
64 128
the measured programs is unexpectedly small at m x g’ = 8, 16 and
52. 工t is also seen that there’is no great diffe:rence in the
effects of various buffer memory structures at m x g’ = 32. Fig.4(f)
compares ・ヒhe characteristics of the three differen・ヒ streams for 1α27
i.e.’ instrUction stream, operand stream, and combined stream♂ 工n
・ヒhis example, R(g 7m) of the ins’ヒruction st:ream is above thaセ of
the operand stream while j≦ 32, and R(g ;m) of the operand s・ヒream
crosses over that of ・ヒhe instruction stream between j = 64 and
ゴ=エ28.
3.5’ nPT stack distance frequency
OPT stack distance frequency is the stack distance frdquency
by an optimum page /(block) replacement algorithm called opT.5)
OPT replacement algorithrp is not reaZizable in gn actual computer
system, but is a useful benchmark for the evaluation of any o’ヒher
replacement algori’ヒhm・
We denote Ro(u”’;m) to be the qumUlative distribution of OPT
stack distance frequency as the function of the number of Plpcks
g’ and the block size m.
S・rPe’.resu・t瞬e sh。wn in Fi弓・5(a)ん5(d)・Fiq・5(a)and 5(b)
show the variations of R i(g’;8) and R A(g’,・16) with g’. Fig.5(c) O’Y. . o
show刀@the variation of Ro(g’;m) with m X g’, in which the effects’
・fb1・c琴r手ze加n R。(鋼aF9. also indicated・エt is obvi。uSly
seen from Fig..5(c) that buffer memo.ry has a good performanc.e whiie
ゴ≧ 4. The effects of using OPT and LRU algori’ヒhm are compared
in Fig.5(d). Some difference appears at g’ = 4 and g’ = 8, but is
not so great.
[ 12 ]
1.O
O.9’・’
O.8
0.7
0.6
0es
O.4
0 .’ 3’
O.2
0.1
o(る,17;m)
.,m
Fig.5 OP[D stack distance frequency.
・ioe. .referdnces
Ro(ゴ,加)
1.O
O.9
0.8
0.7
0.6
0.5
0.4
0e3
0.2
0e1
1
ノη=16
on
2
(b)
4 8 .16 32 .64 128 256
→ ロ ロ』一一{コfO?Ott:;:tti一一“.
d2e2
1 O十
references
」
の
」
i 2 4 8
[ 13 ]
16 ’32 64 ’128
R( 」;m)
0
1.O
O.9
0.8
0.7
0・61 o/”’O.5
0.4
0.3 ノノ メ’ ..’ 80e2 6’ 栩.4
0.ユ
4 8
R( 」 7m)’Ro(ゴ3栩)
ユ.O
m= 160.9
0.8
0PT
O.7
・O.6
0.5
0e4
0.3a cO.2
9・i
T
(c)
,,ノ”
’
dl
p/..一=e(e一一一;xl:一一一:sAt A g’一一一1 ’lo4 references
.一”’一 32 64 12 8 16
初xゴ
16 32 64 128 256 512 1024 2048’4096
(.d)
…(二_LRU ,
OPT and LRU stack dis’ヒance frequency
dエ
‘ 10 references
ast 8 16 32 64 128」
[ユ41
3.6 Effect of instruction buffering
f(m) is defined as the probability ・ヒhat the’add:ress of an
instruction to be fetqhed is found in the bZock which contains
the just previously executed ins’ヒruction-word if memory space.is
divided into blocks each co:nsisdヒinq of m contiguous addresses.
0(m) is defined as the corresponding probabilitY-for operand’
streams..
1(m) = R(1;m) = RA(1;m) .for instruction streams o
O’(m) = R(1;m) = Ro(17m) for.opexand streams
[vhe distributions of 1(m) and O(m)’ are shown in Fig..6‘ [Dhe
variation of R(ゴ,4) with m (= 4× 」 ) for the itts’ヒruction’忌’tream
and operand streain of dl are illustrated in Fig.6 for refexence.
From Fig.6 it is see:n that the effec・ヒ of increasinすm on the.
access rates is greater than that of increasing the number of
blocks for the ins・ヒruction s・ヒream, but 七he effec・ヒ.of increasing
the number of blocks is greater than that of increasing m for
’ヒhe operand stream.
[ 15 ]
.z’ (m) ,o (m)
1; O
O.9
Oe8
O.7
O.6
O.5
O.4
O.3
O.2
o.i
e8’>k
e 2-shj
α2→ノ
dエ→ゾo
Fig・6エnstructi6n and’・perand bufferinq
t tt d, ’/’
dエ’ o/ o
02イ
ox
口
0 (m)
sIO’.一 references
4 8 16 32.64 128
[ 16 ]
工V. Concluding remarks
This,pape「.is・aρqmpan手gn pape「tg’璽. Si叫ation。f a co平Pμte「 B>
system with buff.er .memory ” (. E;SrTRr.76-2..,)..1,.Wethave prepared an
address’狽窒≠モ?秩@t・make・the input data f・r mem。ry reference s加ula七i。n・
At the sarr}e .Vime t. we ,..hq.ve. ;ne.gsurg.q,,.the pdy#.q.rpic .b, ehqyior of the test
pr・grams in several Pgints。f view・、.エn thiS PaPe「、theギes叫ts
obtained by rrie.a,spr.erqep. #$. are pF.e. .s. e.nte.d.. ... Al.though,S qnd B show
various pattgrns, according .tg .thg.di.f.feren.g e.oC p.yogram,.$. tru.cture,
Rhas some si叫ia喚i俘串、 be贈en、P「。9「r卑r・tt.エ#may.bρraidl that this
is one of generai characVeris.tics g.C me.rpory r.efg.rence .and. it keeps.
the effectiveness of buffeF memoFy oF p,agi. g r“eFngry stable .on vqriogs
ordinqry programs. Some of,the .result$ are al.so Vh.e.,outputs obtained
by the memory refere.nq..e..,si,mu,lation of ,buffgr Tpemory.system of the
smaller bu.ff.9r., capaqlties 1.p .qp.rppaptson wil h. the simy lation mod.el
discussed in [ 8.]. Frg.m, Vhe rt $pl.t,s.pne, cqn, seg that Yhe access-
rates to buffer memory are about 50 % if buffer. capacity is.as small.
as 4(w・rd/blgqk)X 2(b↓・cks・,)1aρdゆt the#rv手at、ヰ。n. qf the access卿
rates qmong tesV progFams i.s ,gpexpecte.qly smqll,.i.f .Phe buf.fer.
capacity is sma.1.lr.
Acknowledgmen’ヒ
The authors ・ヒhank Nobuhiro Hayashi for implementing OPT
simula・ヒ6r, and Masanori :Kanazawa and Noriko 工ida for ’ヒheir helpful
comments and criticism. This work was supported in part by Proゴect
of Data Processing Cen’ヒer, 1くyoto University.
[ 17 ]
References
l) Brawn,B.S. and Gustavson,F.G. Program behavior in a paging
environment r AF工PS Conference Proceedings, :FJCC, Vol.33,
pp.iO19-1032, 1968.
2) Denning r P.J. The working set model for program behavior,
COmm・ Of the A. CMr VOI・Zi, NO.5, PP.323-333r Z968・
3) Freibergs,工.F. The dynamic behavior of programs, AFIPS
Conference .Proceedings, FJCC, VoZ.33, pp.1163-l167, 1968.
4) Joseph,M. An analysis of paging and program behaviour,
Compu’ヒer Journal, Vo1.13, No.1, pp.48-54, 1970.
5) Mat・ヒson,R.工、., Gecsei,J.,Slutz,D.R. and Traiger r l.1・.
EValuation techniques for s’ヒorage hierarchies’
工BM Systems Journal, Vol.9,’No.2, PP.78-117, 1970.
6) Sisson.,S.S. and Fiynn,M.J. Addressing patterns and memory
hahdling l alqorithms, A:F工PS Conference Proceedings, FJCC,
Vol.33, pp.957-967, 1968.
7) Liptay,J.S. Structural aspects of the System/360 Model 85
工工 The cache, 工BM Sys’ヒems Journal, Vol.7, No.1, pp.15-21, :L968.
8) Nakamura,T., Kitagawa r H., Kanazawa,M. and Hagiwara,H.
Simulation df a computer system with buffer memory, E工S一・TR-76-2’
University of Tsukuba, 1976.
INSTITurE OF ELECTRON I CS・AND I NFORIVIAT I ON SC I ENCE
UN I VERS I rv OF TSUKUBA
SAKURA-MURA, N I I HAR I-GUN, I BARAKI JAPAN
REIPORT DO( UIVENTAT I ON PAGEREPORT NU]Y[BER
TR-76-3
田ITLE
An Analysis .of Program Behavior
AUTHOR(s)
Tomoo Nakamura (工nstitute of Electronics and 工nformation
Science, University of Tsukuba)
Haゴim6 Kitagawa(Data Pr。cessing Centerr’Kyot。 University)
HirOshi Hagiwara (:Faculity of Engineering, Kyo’ヒ。 University)
REPORT DATE
July 7
MA工:N CA工EGO:RY
Computer Systems
NUMBER OF PAGES
18
CR CATEGORIES
6.2
KEY WORDS
memory hierarchy, buffer memory, program behavior
ABSTRACT
As computer systems have increased their co血plexi’ヒy r
it has become very important to analyze the systemefficiency. 工n the analysis of compu’ヒer systems, it isEairly of use to have the knowledge of dynamic behaviorof programs under execution. This paper is an empiricalstudy of program behavior. We prepared an addres.s tracerwhich recorded the memory reference pat’ヒerns of programsand qot ・ヒhe traced data for several programs・ By us■ngthese da・ヒa we analyzed some kinds of proqram behavior=run length; branch distance; memory demand 7 stackdistance frequency for typical replacement algorithms(LRU and OPT), instruction buffering. In ’ヒhis pape:rwe presen’ヒ the resul’ヒs of the measurement and analysis.
SUPPLEMENTARY NOTES