A DATA DEFINITION AND MAPPING LANGUAGE FOR NUMERICAL DATA BASES 

Ola-Olu A. Daini and Peter Scheuermann

Electrical Engineering and Computer Science Department 

Northwestern University 

Evanston, Illinois 60201 

Abstract 

Numerical data bases arise in many scientific 

applications to keep track of large sparse and 

dense matrices. Although many matrix storage techniques are available
for in-core manipulation, very large matrices residing on secondary
devices are currently limited to a few compact storage schemes, because
of the complexity of the underlying data management facilities.

This paper proposes an approach for generalized 

numerical database management that would promote 

physical data independence by relieving users 

from the need for knowledge of the physical data 

organization on the secondary devices. 

Our approach is to describe each of the storage 

techniques for dense and sparse matrices by a 

physical schema, which encompasses the corresponding 

access path, the encoding to storage structures, 

and the file access method. A generalized 

facility for describing any kind of numerical database 

and its mapping to storage is provided via 

nonprocedural Stored-Data Description and Mapping 

Languages (SDDL and SDML). The languages are processed 

by a Generalized Syntax-Directed Translation 

Scheme (GSDTS) to automatically generate FORTRAN 

conversion programs for creating a numerical database or translating it
from one compact storage scheme

to another. The feasibility of the generalized approach 

with regard to our current implementation 

is also discussed. 

1. Introduction

The problem of storage representation for dense/sparse matrices in main
core, in order to optimize storage costs or processing time, has received
considerable attention in the literature [6,10,12]. A variety of compact
storage schemes have been developed, and facilities for in-core data
manipulation using these schemes are available in a number of the software
packages currently in use at any computing center. However, only a few
compact matrix storage schemes are currently implemented for the
manipulation of large dense or sparse matrices residing on secondary
devices, and these are not readily available [3,8,9]. This is because some
of these methods employ quite complex data structures, such as threaded
linked lists [11], which require complex programs for their implementation
on secondary devices. In addition, an application user faces the added
difficulty of accessing the compact matrix data residing on secondary
devices.

* This research is supported by the University of Ife, Ile-Ife, Nigeria.

**On study leave from Computer Science Department, University of Ife,
Nigeria.

Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct
commercial advantage, the ACM copyright notice and the title of the
publication and its date appear, and notice is given that copying is by
permission of the Association for Computing Machinery. To copy
otherwise, or to republish, requires a fee and/or specific permission.

©1980 ACM 0-89791-028-1/80/1000/0418 $00.75

Numerical data bases are the data bases needed to process numerical
applications; they reside on secondary storage devices in compact matrix
storage forms. A numerical application database may consist of one to
three interrelated files because pseudo data, e.g. distance from the
diagonal and row beginning in the data item vector, is usually kept in
files separate from the data item file. In addition, the files may be
processed by different file access methods, e.g. sequential for the pseudo
data file, and indexed sequential or direct for the index and data item
files.

While there recently have been important advances 

in the use of very large data bases in commercial 

applications, little has been done in the 

area of numerical applications because the current 

facilities of database management systems (DBMS) 

are not suitable for processing numerical data 

bases in the majority of the matrix compact storage 

schemes. In order to address this problem, 

there is a need for a generalized approach to numerical 

database management whereby the numerical 

application users have facilities for data definition 

and mapping as well as data access to numerical 

data bases in any matrix compact storage 

scheme by means of simple high-level nonprocedural

languages that relieve them from the need for 

knowledge of low-level details of physical implementation. 

The main advantage of data definition and mapping facilities is that the
information about storage structures that usually resides in an
application program is moved into a schema. The schema describes the
physical storage organization and its mapping interface to the operating
system, so that the user provides only the logical data descriptions.
These facilities

are usually provided by a data definition 

language (DDL) or by stored-data description and 

mapping languages (SDDL and SDML) [7]. Similarly, 

data access facilities are provided by a data manipulation 

language (DML) which promotes physical 

data independence. 

Our investigation of data language facilities 

reported in [2,7,13,17,18] reveals that none 

are suitable for numerical data management, which 

usually requires different kinds of indexing and 

ordering capabilities. Therefore, we have designed 

the data language facilities (SDDL, SDML, and 

DML) which can provide a generalized approach to 

numerical database management. In view of the 

limited number of compact storage schemes currently 

in use for numerical database management, we 

are implementing a generalized data translator 

that will automatically restructure any numerical 

database from one compact storage scheme to another 

by means of SDDL and SDML facilities. This 

satisfies an important goal of data portability, 

and in addition the methodology developed for the 

data translator is an essential part for the support 

of a DML, which will be implemented in the 

second phase of our project. 

Our current approach provides the following 

features: 

1. Each dense or sparse matrix compact storage

scheme can be described by a physical 

schema, which comprises the corresponding 

data access path, the encoding to storage 

structures and the file access method. 

2. A generalized facility for describing any 

kind of numerical database and its mapping 

to secondary storage, i.e. the physical 

schema, is provided via nonprocedural 

Stored-Data Description and Mapping Languages 

(SDDL and SDML). 

3. A generalized data translator that will 

enable application users to create or to 

restructure their numerical database from 

one compact storage scheme to another, by 

supplying the SDDL and SDML statements of 

the source and target database descriptions.

We begin by describing some relevant concepts 

from numerical analysis and DBMS in Section 2. 

Next, the numerical physical schemas and the SDDL 

and SDML facilities are described in Sections 3 

and 4. The feasibility of the SDDL and SDML in 

numerical database management and their implementation 

by a Generalized Syntax-Directed Translation 

Scheme (GSDTS) as part of our generalized data 

translator is discussed in Section 5. 

2. Overview of Numerical Analysis and DBMS

Concepts 

Numerical data are usually generated in both 

quantitative and qualitative problem solving operations 

in the social sciences, physical sciences, 

engineering, etc. Numerical application data usually corresponds to dense
or sparse matrices, and any such data that is needed to process a
numerical application and resides on secondary storage is called here a
numerical database. We discuss

matrix features which provide guidelines towards 

minimization of storage space and storage data 

representation as well as DBMS concepts, such as 

schema and the data language facilities, which enable 

our generalized approach. 

2.1. Dense and Sparse Matrix Compact Storage 

Schemes 

Two major types of matrices, dense and sparse 

matrices, will be considered. A dense matrix has 

a high proportion of nonzero elements, while a 

sparse matrix has only a few nonzero elements. The two basic features
that permit compact storage of matrix data are symmetry and bandwidth.
Different compact storage schemes for symmetric and band matrices

as well as several other sparse matrix indexing 

schemes are identified in literature [6, 

9-12]. These compact storage schemes are described 

by the corresponding numerical physical 

schemas in our generalized approach, as is described 

in section 3. 

2.2. Schema 

The term schema was originally coined in connection 

with the logical database description, 

i.e. the definition of the objects, roles and 

properties of interest to a given enterprise. The 

term was first brought into usage by the CODASYL 

Database Task Group [4,5]. However, it is now 

used in a broader sense to stand for data descriptions 

in database systems at the logical or physical 

level. Since for our numerical databases the 

logical structure is relatively simple, the role 

of the physical schema which describes the mapping 

to storage becomes predominant. Each type of 

matrix data organization, such as a square, lower 

triangular or band matrix, could be viewed as 

corresponding to a logical schema, while any compact 

storage scheme can be viewed as a storage 

model with a corresponding physical schema. The 

physical schema describes completely the mapping 

to storage in terms of: (1) access path organization,

(2) encoding of storage structures, and 

(3) operating system accessing methods [15]. 

2.3. Data Language Facilities

Data definition and mapping facilities are 

important features of a DBMS which support the 

concept of data independence. These facilities 

are provided either in the form of a self-contained 

language like the data definition language

(DDL) or as two languages which are a stored-data 

description language (SDDL) and a stored-data 

mapping language (SDML). A DDL is generally a 

declarative language for specifying logical data 

structures and a data mapping language specifies 

the mapping of the logical data structure to the 

storage space. 

The Database Task Group in [5] proposed a 

schema DDL as a language for defining a data model 

together with its mapping to storage so that 

it would meet the requirements of many distinct 

programming languages. Another CODASYL group,

the Stored-Data Definition and Translation Task 

Group [7], proposed a stored-data and data translation 

model and language for describing and translating 

among a wide class of logical and physical 

structures. Additional data definition and mapping 

languages have been proposed, with prototype 

implementations, for database reorganization, e.g. 

[2,13,17,18]. The language facilities are usually 

designed for operating on the traditional database 

schemas of relational, hierarchical and network 

data models. 

The matrix compact storage schemes which represent 

our model cannot be suitably defined using 

the data language facilities mentioned above because 

of the requirements for different kinds of 

indexing and data ordering capabilities. Therefore, 

we decided to develop nonprocedural stored-data

description and mapping languages (SDDL and 

SDML) which provide a generalized approach for 

describing and mapping any numerical database to 

secondary storage. The two languages are discussed 

in section 4. 

Another important feature of a DBMS is a data 

manipulation language (DML) which provides the interface 

between the application users and the DBMS 

via a set of higher-level commands. We have designed 

a DML which contains commands embedded in 

FORTRAN, corresponding to the operation performed 

on numerical databases. However, the DML will not 

be discussed further, since its implementation 

will be considered only in a future project. 

3. Numerical Physical Schemas 

As we mentioned previously, the various storage 

techniques for dense and sparse matrices suggested 

in literature can be represented by a corresponding 

physical schema, which depicts not only 

the access path, but also the encoding of storage 

structures and the file access method. In order 

to generalize the description of the physical 

schemas, we investigated their access paths for 

similarities. Our investigation reveals three 

groups which have direct, indirect and linked access 

paths respectively. The direct access path 

corresponds to dense array realization, the indirect 

to the technique of going through an index to 

access a data item (non-zero element) and the 

linked to the technique of accessing a data item 

through other data items connected to it by pointers. 

Formal definitions of the access paths will 

be presented later. 

Since in our case the access paths are closely 

related to the actual encodings of the storage 

structures, which specify mappings into a linear 

address space [15], we identify the groups as direct, 

indirect and linked encodings respectively.

We shall assume that in our approach the linear 

address space refers to storage space on secondary 

devices. 

3.1. Direct Encoding Group 

Numerical physical schemas in this group describe 

compact storage schemes for dense matrices. 

Their logical schema comprises the dense m x n, 

lower-/upper-triangular or band matrices, and 

their storage is either an m x n matrix or a vector. The stored-data
organization is in row or

column major order and the access path is direct. 

Each of these storage schemes requires a single 

external file and those with a nonsymmetric dataset

are usually processed by a sequential file access 

method. But indexed sequential or direct file 

access methods may be appropriate for symmetric 

matrices in order to reduce the access time involved 

in reconstructing the data items for a row/ 

column. We identify the following storage schemes 

in this category (albeit, close to their logical 

counterparts). 

1. address-polynomial (regular m x n matrix)
2. lower- or upper-triangular
3. symmetric-band
4. nonsymmetric-band

An illustration of one of the schemas is shown below in Figure 1.

Source dataset (logical schema):

    4 2 0 0 0
    3 5 6 0 0
    0 1 4 3 0
    0 0 9 1 7
    0 0 0 2 3

Storage scheme:

    0 4 2
    3 5 6
    1 4 3
    9 1 7
    2 3 0

Figure 1 - Dense nonsymmetric-band matrix data structure.

The group's access path is direct because the 

search technique uses computed-access array storage 

mapping which is defined as follows [14]: 

Definition: Let $N$ denote the set of positive integers and $A$ be a
two-dimensional array scheme. A computed access storage mapping for $A$
is a total function $f: N \times N \to N$ such that: (1) $f(1,1) = 1$,
and (2) $f$ is one-to-one on array scheme $A$.
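As a concrete illustration of this definition, the following minimal Python sketch gives two computed-access storage mappings with 1-based subscripts: the address polynomial for a regular m x n matrix in row-major order, and the packed mapping for a lower-triangular matrix stored row by row. The function names and the choice of these two schemes are assumptions made here for illustration; they are not taken from the authors' implementation.

```python
def address_polynomial(i, j, n):
    """Row-major address of element (i, j) in an m x n matrix (1-based)."""
    return (i - 1) * n + j


def lower_triangular(i, j):
    """Address of element (i, j), j <= i, when only the lower triangle
    is stored row by row in a vector (1-based)."""
    if j > i:
        raise ValueError("element lies above the diagonal and is not stored")
    return i * (i - 1) // 2 + j


def is_one_to_one(addresses):
    """Check condition (2) of the definition on a list of addresses."""
    return len(addresses) == len(set(addresses))
```

Both mappings send subscript (1,1) to address 1, and each assigns a distinct address to every stored element, which is what conditions (1) and (2) require.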

3.2. Indirect Encoding Group 

This group of numerical physical schemas describes the storage structures
for all the sparse matrix indexing techniques whose access paths include
reference data stored separately from the data items themselves. Their
logical schema is an m x n sparse matrix or a lower/upper diagonal matrix.
Their storage scheme consists of vectors of data items, i.e., the non-zero
elements, in row- or column-major order, with corresponding row and/or
column indices and/or reference data. Reference data, i.e. pseudo data,
refers to the location of data items within the source matrix; the
row/column beginning in the data item vector; or the distance from the
diagonal. These schemas usually require interrelated sets of two or three
files, and their choice of file access method depends on the type of
expected row/column retrieval. For sequential row/column retrieval, a
sequential file access method is adequate; for random row/column
retrieval, we can choose either indexed sequential/direct for all files or
a combination of sequential for the reference data file and indexed/direct
for the index and data item files. The schemas we identify in this group
are:

1. single-indexing
2. double-indexing-1 (row-column-1)
3. double-indexing-2 (row-column-2)
4. bit-map
5. address-map

An illustration of one of the schemas is shown in Figure 2.

Logical schema:

    1 0 0 2
    0 0 3 0
    0 4 0 0
    5 0 1 2

Storage scheme:

    Row beginning in data item vector:   1 3 4 5        (rows 1 to 4)
    Column index vector:                 1 4 3 2 1 3 4  (positions 1 to 7)
    Data item vector:                    1 2 3 4 5 1 2  (locations M(i,j), 1 to 7)

Figure 2 - Double-indexing-2 (Row-column-2)

Their access path is indirect because the 

search technique uses a composite storage mapping 

which may be defined by the following [11]:

Definition: Let $i$ and $j$ represent the row and column data item
subscripts; $M(i,j)$ the data item location; $\lambda_i$ the beginning
relative address of indices for row $i$; and $\eta_j$ the relative address
of element $j$ in the column index vector, as illustrated in Figure 2.
Data ordering is assumed rowwise; for columnwise ordering we can just
interchange $i$ and $j$. Let $f$ represent any storage mapping function
such that $f(i) = \lambda_i$. A search function, $\mu f$, is defined as
follows:

$$(\mu f)(j, \lambda_i) = \eta_j \quad \text{iff } f(\eta_j) = j \text{ and } \forall \eta_j' \text{ s.t. } \lambda_i \le \eta_j' < \eta_j,\ f(\eta_j') \ne j;$$

$$(\mu f)(j, \lambda_i) = \emptyset \quad \text{iff } \forall \eta_j \in N,\ f(\eta_j) \ne j.$$

A composite mapping function, $h$, on a search function, $\mu f$, is
defined as follows: $h(\mu f(j, f(i))) = M(i,j)$.
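The composite mapping can be sketched in Python as follows, assuming the rowwise double-indexing-2 layout and the logical schema of Figure 2. The function names are hypothetical, the vectors are built from the matrix rather than read from files, and the search is bounded by the beginning of the next row, which is an assumption added here to keep the scan within one row.

```python
def encode_double_indexing_2(matrix):
    """Encode a dense list-of-rows matrix rowwise into three vectors."""
    row_begin, col_index, data = [], [], []
    for row in matrix:
        row_begin.append(len(data) + 1)  # 1-based start of this row
        for j, value in enumerate(row, start=1):
            if value != 0:
                col_index.append(j)
                data.append(value)
    return row_begin, col_index, data


def lookup(i, j, row_begin, col_index, data):
    """Return the data item for (i, j), or None if it is not stored."""
    start = row_begin[i - 1]
    # The row ends where the next row begins (or at the end of the vectors).
    end = row_begin[i] if i < len(row_begin) else len(data) + 1
    for address in range(start, end):
        if col_index[address - 1] == j:  # first address whose index is j
            return data[address - 1]
    return None
```

Here `row_begin[i - 1]` plays the role of the row beginning address, the loop is the search function over the column index vector, and the final indexing into `data` is the composite mapping to the data item location.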

3.3. Linked Encoding Group

The linked encoding group consists of numerical 

physical schemas for all the sparse indexing 

schemes with linked list data structures.

Their logical schema is the m x n sparse matrix 

and their storage scheme consists of lists of nodes. 

Each node has a format which might consist of data 

item, row and column indices and pointer fields. 

The schemas usually require a single file with indexed 

sequential or direct file access method. 

These schemas are further classified as: 

1. linear-linked-list
2. doubly-linked-list
3. threaded-linked-list

Figure 3 shows an illustration of one of them.

Their access path is called linked because the 

search technique uses a mapping defined through 

pointer linkage. 

It may be defined as follows: 

Definition: Let $D = (X, R)$ be a storage structure with nodes
$x_1, \ldots, x_n$ and relations $(r_1, r_2) \subset R$ such that $r_1$
represents a row equivalence relation and $r_2$ represents a column
equivalence relation. In addition, let $\eta x_i$ represent the address of
node $x_i$; $\lambda_i x$ the value of the $i$th pointer field of node
$x$, i.e., the row pointer value; $\lambda_j x$ the value of the $j$th
pointer field of node $x$, i.e., the column pointer value; $X/r_1$ the row
equivalence class and $X/r_2$ the column equivalence class. A linked
mapping is a linked realization of a relation from the header pointer
node, if at least one of the following holds:

1. The relation $r_1$ is realized as a linked structure (relative to the
   $i$th pointer field), i.e., for every pair of nodes
   $(x_1, x_2) \in X/r_1$, $\eta x_2 \in \lambda_i x_1$ holds, or
   similarly $r_2$ is realized as a linked structure.
2. For every ordered three nodes such that $(x_1, x_2) \in X/r_1$ and
   $(x_1, x_3) \in X/r_2$, $\eta x_2 \in \lambda_i x_1$ and
   $\eta x_3 \in \lambda_j x_1$ hold.

In addition, it is possible that the relation $r$ is realized as a linked
structure and the end node $x_n$ points to the header node $x'$, i.e.
$\lambda x_n = \eta x'$.
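A linked realization of this kind can be sketched in Python as below. The sketch assumes that node keys double as node addresses, that 0 stands for the null pointer, that the row pointer leads to the next node in the same row and the column pointer to the next node in the same column, and that header pointers are kept in dictionaries. All of these choices, and the names `Node`, `build_linked` and `walk`, are illustrative assumptions rather than the paper's design.

```python
from dataclasses import dataclass

NULL = 0  # assumed null pointer value


@dataclass
class Node:
    key: int             # node key, used here as the node address
    row: int
    col: int
    value: float
    col_ptr: int = NULL  # key of the next node in the same column
    row_ptr: int = NULL  # key of the next node in the same row


def build_linked(matrix):
    """Return (nodes, row_heads, col_heads) for the nonzero elements."""
    nodes, row_heads, col_heads = {}, {}, {}
    last_in_row, last_in_col = {}, {}
    key = 0
    for i, row in enumerate(matrix, start=1):
        for j, value in enumerate(row, start=1):
            if value == 0:
                continue
            key += 1
            nodes[key] = Node(key, i, j, value)
            if i in last_in_row:
                nodes[last_in_row[i]].row_ptr = key
            else:
                row_heads[i] = key  # first node of row i
            if j in last_in_col:
                nodes[last_in_col[j]].col_ptr = key
            else:
                col_heads[j] = key  # first node of column j
            last_in_row[i] = key
            last_in_col[j] = key
    return nodes, row_heads, col_heads


def walk(nodes, head, field):
    """Follow one pointer field from a header pointer and yield the nodes."""
    key = head
    while key != NULL:
        node = nodes[key]
        yield node
        key = getattr(node, field)
```

Walking from a row header along `row_ptr` visits one row equivalence class, and walking from a column header along `col_ptr` visits one column equivalence class.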

4. Data Language Facilities

The data language facilities provide a generalized 

approach for describing any numerical database 

and its mapping to storage. They consist 

of a stored-data description language (SDDL) and a 

stored-data mapping language (SDML). The two 

languages are similar to other data definition and 

mapping languages [7,17,18]. We have attempted as 

much as possible to make them user friendly, by 

including simple, self-explanatory language constructs. The choice of only
one of the alternatives is represented by {} (braces) and an optional

phrase by [] (square brackets). Language keywords 

appear in capital letters and user-defined words 

in lower case. Sample SDDL and SDML statements 

of both source and target numerical databases are 

shown in Figures 4 and 4.1 respectively. Other 

features of the two languages will be revealed as 

they are described below. 

4.1. Stored-Data Description Language (SDDL)

The SDDL is intended mainly for the user to 

describe the logical characteristics of his numerical 

database and the associated type of file organization 

on secondary storage devices, or alternatively 

the card input format. Therefore, the language is divided into three
parts which are (1) matrix structure, (2) file control, and (3)

input format. 

The matrix structure describes the logical 

characteristics of the data and it also indicates 

if dynamic storage management is required. The 

basic matrix format is specified using the self-explanatory keywords:
{DENSE | SPARSE}, {SYMMETRIC | NONSYMMETRIC}, and {BANDED | NONBANDED}.
If the matrix is symmetric, the statement will include
{UPPER-DIAGONAL | LOWER-DIAGONAL}

in order to specify the partition of the dataset 

Logical schema:

    1 0 0 2
    0 0 3 0
    0 4 0 0
    5 0 1 2

[Storage scheme diagram: the nonzero elements are stored as nodes linked by row and column pointers.]

Node format:

    Node Key | Row Index | Column Index | Data Item | Column Node Pointer | Row Node Pointer

Figure 3. Doubly-linked-list

DATA-DESCRIPTION:
  MATRIX-STRUCTURE:
    TYPE = SPARSE, NONSYMMETRIC, STATIC;
  FILE-CONTROL:
    TYPE = SOURCE;
    FILE-UNIT = 21, 22, 23;
    MEDIUM = DISK;
    RECORD: REC-KEY = integer;
            SIZE = 512, FIXED, UNBLOCKED;
DATA-MAPPING: (double-indexing-2);
  ACCESS-PATH-ENCODING:
    ACCESS-PATH = INDIRECT-ENCODING (REF-DATA-ORG);
    INDIRECT-ENCODING:
      REF-DATA-ORG: (REF-ORG-1, REF-ORG-2, DATA-ORG);
      REF-ORG-1: SET(LOC);
        LOC: integer, TYPE = ROW BEGINNING;
      REF-ORG-2: SET(INDEX);
        INDEX: integer, TYPE = COLUMN INDEX;
      DATA-ORG: DIMENSION = (5000,5000);
                ORDERING = ROWWISE;
                SET(DATA-ITEM);
        DATA-ITEM: real, REAL-PRECISION = DOUBLE;
  ENCODED-FILE:
    FILE-NAME = datfile, indfile, locfile;
    ORGANIZATION = RANDOM, RANDOM, SEQUENTIAL;
    ENCODED-DATA = DATA-ORG, REF-ORG-2, REF-ORG-1;

Figure 4. Sample SDDL & SDML statements of a source numerical database
for a double-indexing-2 schema.

to be processed. Similarly, a bandwidth statement 

which specifies the size of the band is required 

for a band matrix and a density statement giving 

an estimated density of a sparse matrix is necessary 

for creating a database with random file organization. 

Some statements in the matrix structure 

section are shown in the example below. 


TYPE = SPARSE, BANDED, SYMMETRIC, LOWER-DIAGONAL, STATIC;

BANDWIDTH = (250, 250); 

The file control specifies the file organization 

of a numerical database already residing 

on a secondary device or to be created, by listing 

the type of file, device medium, file unit etc. 

The file control statements depend on the device 

medium selected for processing, as specified by the device medium keyword
CARD, TAPE, or DISK. If data is to be processed from a card input stream,
only the file-type, file-unit and device-medium statements

are required, but in addition to these three 

statements, both disk and tape files require record 

statements. 

The file-type statement identifies the source/ 

target file and the file-unit statement gives a

set of FORTRAN READ/WRITE unit numbers for processing 

the files in the database. The record statement 

lists the record properties like record-size,



    TYPE = SPARSE, NONSYMMETRIC, STATIC;
  FILE-CONTROL:
    TYPE = TARGET;
    FILE-UNIT = 4;
    SIZE = 1024, FIXED, UNBLOCKED;
DATA-MAPPING: (doubly-linked-list);
  ACCESS-PATH-ENCODING:
    ACCESS-PATH = LINKED-ENCODING: (LINKED-DATA-ORG);
    LINKED-DATA-ORG: (COL-HEAD-NODE, ROW-HEAD-NODE, DATA-ITEM-NODE);
      COL-HEAD-NODE: (PTR-ITEM, FIELD-LINKAGE);
        PTR-ITEM: integer, TYPE = COL PTR;
        FIELD-LINKAGE = FIRST COL NODE;
      ROW-HEAD-NODE: (PTR-ITEM, FIELD-LINKAGE);
        PTR-ITEM: integer, TYPE = ROW PTR;
        FIELD-LINKAGE = FIRST ROW NODE;
      DATA-ITEM-NODE: (KEY-FIELD, ROW-FIELD, COL-FIELD, DATA-FIELD,
                       COL-PTR-FIELD, ROW-PTR-FIELD);
        KEY-FIELD: NODE-KEY = integer;
        ROW-FIELD: REF-ITEM = INDEX;
          INDEX: integer, TYPE = ROW INDEX;
        COL-FIELD: INDEX: integer, TYPE = COL INDEX;
        DATA-FIELD: ORDERING = NONE;
          DATA-ITEM = real, REAL-PRECISION;
          REAL-PRECISION = DOUBLE;
        COL-PTR-FIELD: PTR-ITEM, FIELD-LINKAGE;
          PTR-ITEM: integer, TYPE = COL PTR;
          FIELD-LINKAGE = NEXT COL NODE;
        ROW-PTR-FIELD: PTR-ITEM, FIELD-LINKAGE;
          PTR-ITEM: integer, TYPE = ROW PTR;
          FIELD-LINKAGE = NEXT ROW NODE;
  ENCODED-FILE:
    FILE-NAME = NODFILE;
    ORGANIZATION = RANDOM;
    ENCODED-DATA = SET(LINKED-DATA-ORG);

Figure 4.1.

Sample SDDL & SDML statements of a target numerical database for a
doubly-linked-list schema.

{FIXED | VARIABLE} and {BLOCKED | UNBLOCKED}. In addition, the file

control section may include any of the following 

optional statements: (1) a record-key statement

to specify either integer or alphanumeric key 

for random file organization; (2) a block-size 

statement required for blocked records; and (3) 

a format statement (similar to FORTRAN) for formatted 

records. Some of these statements are 

illustrated under FILE-CONTROL in figure 4. 


The input-format section provides facilities 

for processing an unstructured database from cards. The section comprises
the dimension, the data ordering and the format statements.

The dimension statement, shown below, specifies the numbers of rows and
columns in the matrix.

DIMENSION = {ROW | COLUMN}, integer, {COLUMN | ROW}, integer;

The data ordering statement specifies a rowwise/columnwise/none ordering.
The data-format statement:

DATA-FORMAT = {SPARSE-TYPE-1 | SPARSE-TYPE-2 | DENSE};

gives users three choices of format specifications. 

Both SPARSE-TYPE-I and SPARSE-TYPE-2 are for sparse 

matrix input format specifications of only nonzero 

elements and the DENSE is for all the matrix elements. 

SPARSE-TYPE-1 is for ordered input data, so that one row or column input
data stream is processed at a time. As shown below,

SPARSE-TYPE-1: CONTROL-DATA = {ROW | COLUMN}, data-type;
               FORMAT = SET(data-type, data-type);

it requires control data to specify the row or column to be processed, so
that the format becomes a set of pairs of column/row and data item
data-types. A data-type is any valid FORTRAN format specification for
spacing, alphanumeric, integer or real variables, e.g. 5X, I6, F10.4 and
E20.12.

SPARSE-TYPE-2 is for unordered input data,

so that the format is a set of row, column, and 

data item data-types as follows: 

SPARSE-TYPE-2 = SET([ROW], data-type, 

[COLUMN], data-type, 

data-type); 

Finally, DENSE = SET(data-type); provides for 

a set of regular FORTRAN-type format specifications. 

An example of a SPARSE-TYPE-1 input format is shown below.

INPUT-FORMAT:
  DIMENSION = ROW, 5000, COLUMN, 5000;
  ORDERING = ROWWISE;
  SPARSE-TYPE-1: CONTROL-DATA = ROW, I4;
                 FORMAT = 5(I4,2X,F10.6);
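To make the SPARSE-TYPE-1 example concrete, the following Python sketch reads card images laid out according to CONTROL-DATA = ROW, I4 and FORMAT = 5(I4,2X,F10.6). It assumes that the control data and the pairs arrive as separate fixed-width records and that unused pair fields on a card are blank; both are assumptions made here, as are the function names. The generated conversion programs described in the paper are FORTRAN, so this is only an illustration of the record layout.

```python
PAIRS_PER_CARD = 5       # repeat count in FORMAT = 5(I4,2X,F10.6)
PAIR_WIDTH = 4 + 2 + 10  # I4 + 2X + F10.6


def parse_control_card(card):
    """Read the row number from a control card written with format I4."""
    return int(card[0:4])


def parse_data_card(card):
    """Read up to five (column, value) pairs from one data card."""
    pairs = []
    for k in range(PAIRS_PER_CARD):
        field = card[k * PAIR_WIDTH:(k + 1) * PAIR_WIDTH]
        if not field.strip():
            break  # remaining fields on the card are blank
        column = int(field[0:4])    # I4
        value = float(field[6:16])  # F10.6, after the 2X spacing
        pairs.append((column, value))
    return pairs
```

Each pair occupies 16 columns, so a single 80-column card holds exactly the five pairs that the repeat count specifies.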

4.2. Stored-Data Mapping Language (SDML) 

The SDML has two functions: (1) to describe

the different types of mapping which the 

system can make between a logical schema and a 

target storage space, and (2) to describe the encoding 

to storage structures. The language has two major sections: the access
path encoding and the encoded file. The major emphasis

of the language is on the access path encoding, 

which represents the most difficult part of the 

mapping description. The encoded file section enables 

the assignment of encoded data (data items 

and pseudo data) to the files in the database according 

to the corresponding definitions of filenames 

and file accessing methods. 

The access path encoding section enables the selection of an appropriate
mapping subsection and relates its subsections to the mapping descriptions
of the direct, indirect and linked schema encoding groups. Reference to
mapping descriptions defined in one encoding group by another is a common
feature of the language, e.g. the REF-ITEM definition of pseudo data in
the indirect encoding subsection is referenced by the linked encoding
subsection.

The direct encoding, implied by the DATA-ORG: subsection, describes the
data item with its properties like data ordering and type. It also
provides for an optional definition of dimension and bandwidth for a
source database description. The indirect encoding provides a choice of
mapping alternatives for encoding pseudo data and data items to separate
encoded files by the mapping descriptions identified by MAP-ORG: and
REF-ORG: (see Figure 4). In addition, an ordered combination of pseudo
data and data items may be mapped to an encoded file by the MIXED-ORG:
mapping description as follows:

MIXED-ORG: SET(ORDERED {(REF-ITEM, DATA-ORG) |
                        (REF-ITEM, REF-ITEM, DATA-ORG)});

The linked encoding enables the mapping of any set of nodes to an encoded
file. Each node is identified by a user-defined node-name and consists of
a set of fields. Each field is described by an optional field-name and a
field identifier which may be a node key, pseudo data, or data item. An
example of linked encoding mapping is illustrated in Figure 4.1.

The mapping description consists of definitions of both primitive and
nonprimitive data structures. A structure of primitive type is usually
represented by an assignment statement, while a nonprimitive structure is
represented by a descriptive statement consisting of a set or group name
and a set or group definition [16]. We provide the following constructs
in the language to specify data, ordering and linkage definitions:

1. ordering definition types--rowwise, columnwise and none;
2. basic data types--integer, real, and alphanumeric;
3. linkage definition types--header, first, next, prior, last, row,
   column, node, field, and null.

A valid and meaningful linkage definition, except the NULL keyword,
requires an ordered combination of the following: (1) a pointer linkage
keyword, (2) row or column, and (3) node or field. The pointer linkage
keywords are header, first, next, prior, and last. An example of a valid
definition is FIRST ROW NODE.

An example of a primitive type data structure is:

DATA-ITEM = {integer | real | alpha};

An example of a nonprimitive type data structure illustrating a SET
definition is:

DATA-ORG: SET(DATA-ITEM), ORDERING = {ROWWISE | COLUMNWISE | NONE};

A primitive type data structure which is semantically ambiguous, e.g.
index and pointer, becomes a nonprimitive structure by qualifying the
basic data definition with a semantic phrase definition as follows:

INDEX: {integer | alpha}, TYPE = {ROW INDEX | COLUMN INDEX |
                                  CONCAT(ROW INDEX, COLUMN INDEX)};

An access path is described by ORDERING and 

LINKAGE phrases. ORDERING describes the matrix 

data access path by row, column or none. It is assumed 

that the ORDERING of reference items, i.e., 

indices and locations (within the matrix or from 

diagonal elements) corresponds to that of matrix 

data items. LINKAGE describes linked list structure

connectivity by a combination of linkage keywords 

as in the following example: 

PTR-ORG: SET(PTR-ITEM), LINKAGE=NEXT COLUMN FIELD; 
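The rule for a valid linkage definition given in this section (NULL alone, or a pointer linkage keyword followed by row or column and then node or field) is simple enough to check mechanically. The Python sketch below is one hypothetical way to do so; it is not the GSDTS translator, and accepting COL as well as COLUMN is an assumption based on the spelling used in Figure 4.1.

```python
POINTER_KEYWORDS = {"HEADER", "FIRST", "NEXT", "PRIOR", "LAST"}
DIRECTIONS = {"ROW", "COLUMN", "COL"}  # COL is the spelling used in Figure 4.1
TARGETS = {"NODE", "FIELD"}


def is_valid_linkage(definition):
    """Return True if the text is NULL or keyword + direction + target."""
    words = definition.upper().split()
    if words == ["NULL"]:
        return True
    return (
        len(words) == 3
        and words[0] in POINTER_KEYWORDS
        and words[1] in DIRECTIONS
        and words[2] in TARGETS
    )
```

Because the combination is ordered, a phrase such as ROW FIRST NODE is rejected even though each of its words is a linkage keyword.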

5. The Feasibility of SDDL and SDML in a Numerical

Database System 

The current approach to numerical database 

management is restricted to a few matrix compact 

storage schemes. The most common compact storage

scheme for processing sparse matrices residing on 

secondary devices is the double-indexing (row-column)

technique, but this is not the best technique 

for many applications. A few research 

groups, e.g., [9], have tried the linked list

technique for programs tailored to their applications;

however, these programs are not always available for

public distribution. 

Our investigation of the implementation of 

a generalized approach to numerical database management 

reveals two basic requirements. The 

first requirement is for the numerical database to 

reside on secondary storage using the storage 

scheme that is best fitted for its application. 

The second requirement is to provide tools for 

data access that will promote physical data independence 

through the implementation of a DML. 

It is obvious that the first requirement is 

a prerequisite to the second and that there are 

two options for its realization. The first option 

is for each user to be responsible for structuring 

his numerical database corresponding to the physical 

schema best suited to his application. This option

is not practical because a user may not know 

how to structure his database to suit his objective. 

The second option is to have a generalized data 

translator that will automatically restructure any 

numerical database from one physical schema to another, 

or convert unstructured raw data not in a 

compact storage form, corresponding to a physical 

schema. It is essential for this option to be integrated 

into any effective generalized approach 

to numerical database management. 

Our first priority then is to develop a generalized 

data translator for numerical databases 

that will isolate the users from the underlying 

data management through stored-data description 

and mapping language facilities. 

5.1. A Generalized Data Translator for Numerical

Databases 

We are currently developing a generalized data 

translator for numerical databases as a first 

step towards developing a generalized numerical 

database management system. The generalized data 

translator is focused on the implementation of our 

nonprocedural Stored-Data Description and Mapping 

Languages (SDDL and SDML). Its function is to automatically 

create or restructure a numerical database 

from one schema to another in two consecutive 

processes of compilation and data translation 

(to be discussed later). Its input, supplied 

by the user, consists of the source and target 

SDDL and SDML statements (see Figure 4), and a 

source numerical database. Its output is the target

numerical database. The overall functions are 

illustrated in Figure 5. 

During the compilation process, the user-supplied 

SDDL and SDML statements are converted by a 

lexical analyzer into a token stream which is 

translated by a Generalized Syntax Directed Translation 

Scheme (GSDTS) into FORTRAN source programs

of the reader, the restructurer, and the writer 

subroutines. After compilation by a FORTRAN compiler, 

the subroutines become the major components 

of the translator subsystem. The translator subsystem 

also includes common data table information, 

shown in Figure 6, and utility functions and routines 

to compute mapping functions, e.g., symmetric

and band address locations, and to execute 

search and reordering algorithms. 
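
The text does not give the mapping functions themselves. As a rough illustration of what a symmetric or band address computation can look like, the Python sketch below uses two common textbook layouts: the lower triangle stored row by row, and a band stored with a fixed number of slots per row. Both layouts, the 1-based addressing, and the function names are assumptions and may differ from the utilities in the translator subsystem.

```python
from typing import Optional


def symmetric_address(i: int, j: int) -> int:
    """1-based position of element (i, j) when only the lower triangle
    of a symmetric matrix is stored row by row (assumed layout)."""
    if j > i:
        # a(i, j) equals a(j, i), so fold the request into the lower triangle.
        i, j = j, i
    return i * (i - 1) // 2 + j


def band_address(i: int, j: int, lower: int, upper: int) -> Optional[int]:
    """1-based position of element (i, j) when every row stores exactly
    lower + upper + 1 slots centred on the diagonal (assumed layout).
    Returns None for elements outside the band."""
    offset = j - i
    if offset < -lower or offset > upper:
        return None
    width = lower + upper + 1
    return (i - 1) * width + (offset + lower) + 1
```

Under these assumptions, elements (3, 1) and (1, 3) of a symmetric matrix share one storage location, and an element outside the band has no address at all.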

5.2. Data Translation Process 

The data translation process of the translator 

subsystem starts with the encoding of each record(s) 

of the source database into a translator 

internal form (TIF), followed by the decoding of 

TIF data to encoded record(s), and ending with the 

writing of record(s) on the storage devices. The 

components of the TIF are (1) the row/column identifier,

(2) the index buffer for column/row index, 

and (3) the data item buffer for row/column data 

item. The translation process is controlled by 

the translation supervisor which activates the 

reader to encode the source database record(s) to 

TIF data, followed by the restructurer to decode 

the TIF data to encoded record(s), and then the 

writer to convert the encoded record(s) to physical 

record(s) and to write them on the storage device.

Each subroutine returns control to the supervisor, 

which activates the next subroutine accordingly, 

and the process is repeated until all 

the records of the source database have been processed. 

Figure 6.1 illustrates a data translation 

process of double-index-2 source database to doubly-linked-list

target database. 
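
The control flow described above, in which the supervisor activates the reader, the restructurer, and the writer in turn for each source record, can be sketched in Python as follows. This is only an outline under simplifying assumptions: the three subroutines are passed in as plain callables, the record layouts in the usage lines are invented for the example, and the case where the whole database must first be reordered in a workfile is ignored.

```python
from typing import Callable, Iterable, List


def supervise(source_records: Iterable, reader: Callable,
              restructurer: Callable, writer: Callable) -> int:
    """Run one translation iteration per source record and return
    the number of iterations performed (hypothetical supervisor)."""
    iterations = 0
    for record in source_records:
        tif = reader(record)          # encode step: source record -> TIF data
        encoded = restructurer(tif)   # decode step: TIF data -> encoded record(s)
        writer(encoded)               # write step: encoded record(s) -> storage
        iterations += 1
    return iterations


# Usage with stand-in subroutines; the list `target` plays the storage device.
target: List[tuple] = []
count = supervise(
    [(1, [1, 4], [1, 2]), (2, [3], [3])],
    reader=lambda rec: {"id": rec[0], "index": rec[1], "data": rec[2]},
    restructurer=lambda tif: [(tif["id"], col, val)
                              for col, val in zip(tif["index"], tif["data"])],
    writer=target.extend,
)
```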

5.2.1. Reader Module 

The reader encodes both the unstructured matrix 

data, i.e., raw data not in any compact storage

form, and the numerical database. In both cases,

the information in the source file control 

table and either the input format or the physical 

schema table (see Figure 6) is used by the reader 

to read source data from cards or secondary devices 

and encode it into the translator internal 

form (TIF) data. The source data is processed by 

row/column according to the input format or physical 

schema specification. In order to produce the 

TIF data, each encode step of the translation iteration

does the following: (1) fills in the appropriate

row/column identifier, and (2) fills in 

the corresponding index and data buffers for that 

row/column (see Step 1a of Figure 6.1). For example,

with the row identifier equal to 1, we have 1 and

4 in the column index buffer, as well as 1 and 2 in

the data item buffer. On completion, control is returned

to the supervisor for the next step of

translation iteration, i.e., the decode step by 

the restructurer. 
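
The encode step can be made concrete with the double-index source database of Figure 6.1. The Python sketch below is illustrative only: the three arrays are read from that figure with the zero padding dropped, and the record type, the function name, and the use of in-memory lists in place of sequential files are assumptions.

```python
from typing import Iterator, List, NamedTuple


class TIF(NamedTuple):
    identifier: int          # row identifier (row-wise source ordering)
    index_buffer: List[int]  # column indices of the stored entries
    data_buffer: List[int]   # data items matching the column indices


def read_rows(row_begin: List[int], col_index: List[int],
              data: List[int]) -> Iterator[TIF]:
    """Yield one TIF record per row of a row-wise double-index database.
    row_begin holds the 1-based position of each row's first entry."""
    n_rows = len(row_begin)
    for row in range(1, n_rows + 1):
        start = row_begin[row - 1]
        # The last row runs to the end of the stored entries.
        end = row_begin[row] if row < n_rows else len(data) + 1
        yield TIF(row, col_index[start - 1:end - 1], data[start - 1:end - 1])


# Arrays as read from Figure 6.1, without the trailing zero padding.
ROW_BEGIN = [1, 3, 4, 5]
COL_INDEX = [1, 4, 3, 2, 1, 3, 4]
DATA = [1, 2, 3, 4, 5, 1, 2]
records = list(read_rows(ROW_BEGIN, COL_INDEX, DATA))
```

The first record produced carries row identifier 1, column indices 1 and 4, and data items 1 and 2, which agrees with the example in the text.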

[Figure 5. Usage and functions of the generalized data translator. Compilation: the source and the target SDDL & SDML statements each pass through a lexical analyzer; the source and target tokens are input to the GSDTS for SDDL and SDML, which produces FORTRAN conversion programs for the FORTRAN compiler. Translation: the translator subsystem converts the source numerical database, through internal form data, into the target numerical database.]

[Figure 6. Major components of the translator subsystem. Source side: file control table and input format or physical schema table (*), describing the source numerical database. Target side: physical schema table and file control table, describing the target numerical database. The restructurer and the writer operate within the translator subsystem on translator internal form data. * Either Input Format--unstructured (raw) source matrix data, or Physical Schema Table--source database in compact storage form. Legend: data descriptions, data flow, processing sequence.]

5.2.2. Restructurer Module

If the source ordering is different from the

target ordering, the TIF data of the entire database

is temporarily stored in a workfile(s) to be

reordered before it is decoded; otherwise, the TIF

data is decoded into encoded data corresponding to

the target schema as received. Each decode step

of the translation iteration from the TIF data to

a direct encoding group discards the index buffer

and reorganizes the data items to the appropriate

encoded data. For the indirect encoding

group, both the data items and the index, which is

converted to the appropriate pseudo data, become

the encoded data. However, the linked encoding 

group requires the supervisor to create null head 

nodes during initialization. Data item nodes with 

any appropriate pointers are created to form the 

encoded data at each decode step. For example, in 

Step 1b of Figure 6.1, two data item nodes for the

first row are created to correspond to the TIF data 

in Step 1a. In addition, "1" in the row and column

head nodes represents the pointer to the first data 

item node, and "2" in the column head and the first 

data item nodes respectively represents the column 

pointer to the second data item node. At the end 

of this step, control is returned to the supervisor 

for the last phase of the translation iteration,

i.e., writing the encoded data on the secondary devices

by the writer. 
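
When the source and target orderings differ, the TIF data must be reordered before decoding. The Python sketch below shows one way such a reordering from row-wise to column-wise TIF data could work. It is an assumption-laden illustration: an in-memory dictionary stands in for the workfile, TIF records are plain tuples, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

TIFRecord = Tuple[int, List[int], List[int]]  # (identifier, index buffer, data buffer)


def reorder_to_columnwise(rowwise: Iterable[TIFRecord]) -> List[TIFRecord]:
    """Regroup row-wise TIF records into column-wise TIF records.
    The dictionary plays the role of the temporary workfile."""
    workfile = defaultdict(list)
    for row_id, index_buffer, data_buffer in rowwise:
        for col, value in zip(index_buffer, data_buffer):
            workfile[col].append((row_id, value))
    result = []
    for col in sorted(workfile):
        entries = sorted(workfile[col])  # ascending row index within a column
        result.append((col,
                       [row for row, _ in entries],
                       [value for _, value in entries]))
    return result


# Row-wise TIF data for the 4 x 4 matrix of Figure 6.1.
rowwise_tif = [(1, [1, 4], [1, 2]), (2, [3], [3]),
               (3, [2], [4]), (4, [1, 3, 4], [5, 1, 2])]
columnwise_tif = reorder_to_columnwise(rowwise_tif)
```

After reordering, each record carries a column identifier, the row indices of that column's entries, and the corresponding data items.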

5.2.3. Writer Module 

The writer uses the information in the target 

file control table to open the file(s) of the target 

database during initialization and closes them 

after the entire database has been processed. It 

performs the last phase of each translation iteration 

by converting the encoded data into physical 

record(s) to be written on the secondary devices 

according to the user-defined target file access

method. For example, with regard to the encoded 

data of Step 1b in Figure 6.1, the head node records

are updated records which are rewritten in 

place, and the data item node record is written 

as a new record on the secondary device. On completion,

control is returned to the supervisor for 

another translation iteration to begin with the 

reader. 

Figure 6.1. An illustration of a data translation process.

Step 0. Source database of figure 2

    Logical schema:        1 0 0 2
                           0 0 3 0
                           0 4 0 0
                           5 0 1 2

    Row beginning file:    | 1 3 4 5 | 0 0 0 0 |
    Column index file:     | 1 4 3 2 | 1 3 4 0 |
    Data item file:        | 1 2 3 4 | 5 1 2 0 |

    Source record size = 4;
    Source file org. = sequential for all files.

Target database of figure 3 (partial data description)

    Target record size = 14;     Buffer size = 4;
    No of row = 4;               No of column = 4;
    Target file org. = random;   Record key = integer;

Translation Start

Initialization Operation: create null head node records

    Row-head node rec.   | 1 | 0 0 0 0 0 ..... 0 |
    Col-head node rec.   | 2 | 0 0 0 0 0 ..... 0 |
                         (first field: rec key)

1st Translation Iteration

Step 1a. Source data to TIF (translator internal form) data

    Row identifier = 1;
    Index buffer = | 1 4 0 0 |     Data buffer = | 1 2 0 0 |

Step 1b. TIF data to Encoded Data

    Row-head node rec.   | 1 | 1 0 0 0 0 .....
    Col-head node rec.   | 2 | 1 0 0 2 0 .....
    Data-item node rec.  | 3 | 1 | 1 1 1 0 2 | 2 | 1 4 2 0
                         (fields labelled: rec key, node key, node key)

5.3. Compilation Process

The compilation process is the sequence of

operations necessary to automatically produce the

reader, the restructurer, and the writer subroutine

programs from the SDDL and SDML statements

supplied by the user. Our investigation of automatic

data conversion techniques [2,13,17,18] reveals

that compiler-compiler techniques are generally

used. In order to be able to perform a

broad, useful and syntactically valid class of

translations, we decided that a generalized syntax-directed

translation scheme (GSDTS) is the best model 

for our application. Because FORTRAN is the 

programming language of the majority of numerical

application users, we decided to write the translation 

software in portable FORTRAN so that it can 

be of general distribution with little or no modification 

of the source programs from one computer 

system to another. 

A GSDTS requires an underlying LR(k) context-free

grammar. Therefore, we had to construct LR(k) 

grammars for our SDDL and SDML, and in order to

minimize the compilation time, we have constructed 

SLR(1) grammars for the SDDL and SDML such that the 

terminal symbols are single digits or letters except 

the user-defined variables and constants.

The grammars and the LR(1) automatic parser generator 

which is used to validate them as part of the 

system initialization process are discussed below. 

A token stream of single digits or letters 

for keywords, and user-defined variables and constants 

is the output from the conversion of the 

SDDL and SDML statements by the lexical analyzer 

[11]. For example, "TYPE = SOURCE;" is converted

to "1", "TYPE = TARGET;" becomes "2", and "FILE-NAME

= SAMPLE;" becomes "SAMPLE". The token stream is

the input to the GSDTS which produces the source 

FORTRAN subroutine programs to be compiled by the 

FORTRAN compiler into object decks as the final 

output of the compilation process. 
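
The conversion of statements to tokens can be pictured with a toy lexical analyzer. The Python sketch below covers only the three example statements given above and is not the actual analyzer: the table layout and function name are hypothetical, and reading the code for TYPE = SOURCE as the digit 1 is an assumption.

```python
from typing import List

# Keyword phrases map to single-character codes (only the examples from the text).
KEYWORD_CODES = {("TYPE", "SOURCE"): "1", ("TYPE", "TARGET"): "2"}
# Names whose user-defined value is passed through unchanged.
PASS_THROUGH = {"FILE-NAME"}


def tokenize(statements: str) -> List[str]:
    """Convert ';'-terminated 'NAME = VALUE' statements into a token stream."""
    tokens = []
    for statement in statements.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        name, _, value = statement.partition("=")
        name, value = name.strip().upper(), value.strip()
        if (name, value.upper()) in KEYWORD_CODES:
            tokens.append(KEYWORD_CODES[(name, value.upper())])
        elif name in PASS_THROUGH:
            tokens.append(value)
        else:
            raise ValueError(f"unrecognized statement: {statement!r}")
    return tokens
```

For instance, the input TYPE = SOURCE; FILE-NAME = SAMPLE; yields the two tokens 1 and SAMPLE under these assumptions.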

An illustration of the compilation process is 

shown in figure 6.2. The SDDL statements of figure 

4 are input to the lexical analyzer. The 

statements are processed by the lexical analyzer

to produce an output token stream, which becomes 

an input to the GSDTS. The token stream is processed 

by the GSDTS in a concurrent operation of 

LR(1) parsing and semantic analysis. If no error 

is encountered during parsing and on successful

reduction to the final state, the Semantic Analyzer

outputs the generated FORTRAN statements. 

We would like to mention that all data declarations

are made in the Translator Subsystem so 

that the routines would have access to the common 

variables, even if there is an overlay operation. 

This explains why only the Translator Subsystem 

declarative statements are generated in figure 6.2,

because the Reader routine FORTRAN statements of 

a structured database are generated by processing 

the SDML statements. On the other hand, an

unstructured source database has no SDML statements,

so in this case the Reader routine FORTRAN

statements are generated along with the Translator 

Subsystem declarative statements by processing the 

SDDL statements. 

Figure 6.2. An illustration of the Compilation Process

Conversion of SDDL statements of figure 4 to tokens

Input statements:

    TYPE = SPARSE, NONSYMMETRIC, STATIC;
    FILE-CONTROL:
    TYPE = SOURCE;
    FILE-UNIT = 21, 22, 23;
    SIZE = 512,
    FIXED,
    UNBLOCKED;

Token stream (output from Lexical Analyzer, input to GSDTS):

    S  21N  22N  23N  3  I  512  1  2

GSDTS output (FORTRAN declarative statements for the Translator Subsystem):

      INTEGER ROWID, COLID, BUFSZE, SDATOG, UPRCOD
      INTEGER RCOSTA, RECSZE, FLEUNT
      INTEGER DIAGID, DENSTY, FLENAM, BLKSZE
      DIMENSION INDROW(500), INDCOL(500), DATA(500),
     1          INDEX(500),FLEUNT(3)
      DIMENSION DATBUF(500), INDBUF(500)
      DIMENSION FLEUNT(3), FLEID(3), FLENAM(42)
      COMMON/GLOBAL/NOROW, NOCOL, ROWID, COLID, LWRCOD,
     1 BUFSZE, IERROR, SDATOG, UPRCOD, DATBUF, INDBUF
      COMMON/ENCCOM/RCOSTA, INDPTR, KONTRL, RECSZE,
     1 DATA, INDROW, INDCOL, FLEUNT
      DATA BUFSZE/500/
      DATA FLEUNT(1), FLEUNT(2), FLEUNT(3) / 21,22,23/
      DATA RECSZE,BLKSZE,RECKEY /512,0,1/

5.3.1. SLR(1) Grammars for SDDL and SDML

We have constructed one SLR(1) grammar for

the SDDL such that terminal symbols for keywords

are generally numerical codes with single letters

wherever it is necessary to provide one unique

lookahead symbol for consistency resolution. In

order to maintain a modular programming approach

and provide for execution time storage overlay

should the need arise, we constructed two SLR(1)

grammars for the SDML: one for the Direct

and Indirect Encoding Sections, and another

for the Linked Encoding Section, with the Encoded

File Section included in each grammar. The two

SLR(1) grammars are similar to that of SDDL.

The nonterminals of the grammars are in self-explanatory

BNF, e.g., , ,

and. One advantage 

of the modular SLR(1) grammar approach is that new

features, like additional pointer linkage definitions,

could be added to the language with easy 

modification of the corresponding grammar. All 

the grammars have been proved to be SLR(1) by the 

LR(1) automatic parser generator. 

5.3.2. LR(1) Automatic Parser Generator 

The LR(1) automatic parser generator, developed 

by Wetherell and Shannon in [19], is written 

entirely in portable ANSI Standard FORTRAN 66 and 

it has been successfully operating on a number of 

computers. It generates a space efficient parser 

for any LR(1) grammar. It reads a context-free 

grammar in a modified BNF format and produces tables 

which describe an LR(1) parsing automaton. It 

has been used to validate our SDDL and SDML grammars 

and to produce the corresponding tables for 

describing their LR(1) parsing automata. The tables 

consist of dimension and data statements to be 

embedded into the LR(1) parser subroutines to be 

described later. The procedure is performed once 

as part of our system initialization operation for 

the development of the GSDTS--for the SDDL and the 

SDML to be discussed below. 

5.3.3. GSDTS--for the SDDL and the SDML 

Generalized syntax-directed translation

schemes (GSDTS) are well defined in literature and 

we chose to implement a bottom-up execution of 

GSDTS [1]. The major components of the GSDTS--

for the SDDL and the SDML are, as illustrated in 

Figure 7, the following: (1) LR(1) parser, (2)

LR(1) tables, (3) Semantic Analyzer, and (4) SDDL 

and SDML Semantic Tables. Its input is the SDDL 

and SDML token stream generated by the lexical

analyzer and assigned token values from LR(1) tables 

by the LR(1) parser's internal scanner. The 

outputs produced by the GSDTS are the reader, the 

restructurer and the writer FORTRAN source subroutines 

produced from the tokens of the source 

SDDL and SDML, the target SDML, and the target

SDDL respectively. 

[Figure 7. GSDTS for SDDL and SDML. The SDDL and SDML tokens enter the LR(1) parser, which uses its tables; the Semantic Analyzer uses its rules; the output of the GSDTS is the FORTRAN conversion program.]

The LR(1) parser is a set of subroutines

which interpret the LR(1) tables to construct a

parse of the SDDL and SDML token stream. Some of the

subroutines were part of the software developed

by Wetherell and Shannon in [19], but they have 

been modified and tested to suit our application. 

We have developed three LR(1) parsers for the 

SDDL, the direct and indirect encodings, and the 

linked encoding SLR(1) grammars respectively.

The Semantic Analyzer consists of two major 

routines which perform the semantic analysis and 

the output production. The SDDL and SDML Semantic 

Tables contain the semantic rules corresponding 

to the SLR(1) grammar production rules. However, 

we are currently restricting our implementation to 

a few physical schemas which are representative of 

the three encoding groups. Therefore, the current

semantic tables contain semantic rules corresponding 

to only those physical schemas, with null 

rules for the others so that they could be easily 

extended after the completion of the current development 

process. 
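
The idea of semantic tables that pair each production with a semantic rule, with null rules as placeholders for schemas not yet implemented, can be sketched as follows. This Python fragment is purely illustrative: the production names and the table layout are hypothetical, and only the form of the generated DATA statement is taken from Figure 6.2.

```python
from typing import Callable, Dict, List, Optional


def emit_file_units(units: List[int]) -> str:
    """Semantic rule: generate a FORTRAN DATA statement for the file units."""
    names = ", ".join(f"FLEUNT({i})" for i in range(1, len(units) + 1))
    values = ",".join(str(unit) for unit in units)
    return f"      DATA {names} / {values}/"


# Hypothetical semantic table: production name -> semantic rule.
# None marks a null rule kept as a placeholder for later extension.
SEMANTIC_TABLE: Dict[str, Optional[Callable[[List[int]], str]]] = {
    "file_unit_list": emit_file_units,
    "unimplemented_schema": None,
}


def apply_rule(production: str, attributes: List[int]) -> Optional[str]:
    """Apply the semantic rule of a reduced production.
    Returns None for a null rule; raises KeyError for an unknown production."""
    rule = SEMANTIC_TABLE[production]
    if rule is None:
        return None
    return rule(attributes)
```

Extending the implementation to a further physical schema then amounts to replacing a null entry with a rule, leaving the grammar and the parser tables untouched.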

6. Future Directions and Developments 

In this paper, we have provided a model of a 

generalized approach for describing and mapping 

any numerical database to secondary storage by nonprocedural 

Stored-Data Description and Mapping 

Languages (SDDL and SDML). We have also shown how 

the DBMS concepts like schema and data language

facilities are also applicable to databases necessary 

to process numerical applications, which are 

residing on secondary devices. In addition, we 

have also discussed the feasibility of our model 

as a valuable tool in numerical database management 

as described in the current implementation 

of our generalized data translator for numerical 

databases. 

An area for the extension of this research

is in the implementation of a data manipulation 

language (DML). As previously mentioned, we have 

already designed a DML which consists of certain 

primitive statements that correspond to the operations 

permitted on the numerical database and embedded 

into FORTRAN. The file control and the 

physical schema tables, and some of the conversion 

utility subroutines of our model would be of use 

in the implementation of the DML at a later date. 

Another area of research is in the performance 

evaluation of the numerical physical schemas 

with regard to specific applications or numerical

operations. MacVeigh has reported in [10] the

effect of data representation on the cost of 

sparse matrix operations in primary storage. It 

is desirable to extend this work to secondary storage 

and to develop a performance evaluation model 

for matching the numerical database of an application

to the best-fit physical schema on secondary storage. 

Finally, we would like to identify some physical 

schemas of our model that have currently 

proved to be of practical applications in numerical 

database management. The threaded-linked-list

structure has been successfully implemented in the 

WARDEN system in use at the University of Warwick 

[9] for Computer-Aided Design. Besides, secondary 

storage implementations that are similar to our 

direct encoding group, are identified in EASY-- 

an Engineering Analysis System of Utility Programs 

[8], while a row-column schema is used in Vectorized 

General Sparsity Algorithms with Backing

Store [3]. Since the need for secondary storage 

backup is relative to the size of the primary 

storage, our model will be of great advantage in 

institutions with small or medium size computing 

facilities. 

REFERENCES 

1. Aho, A.V. & Ullman, J.D. "The Theory of Parsing, Translation and Compiling, Volume II: Compiling," Prentice-Hall, Inc., Englewood Cliffs, N.J., 1973.

2. Bach, M.J., et al. "The ADAPT System: A Generalized Approach Towards Data Conversion," Proc. 5th Int. Conf. Very Large Data Bases, ACM, N.Y. Oct. 1979, pp. 183-193.

3. Calahan, D.A., et al. "Vectorized General Sparsity Algorithms with Backing Store," Systems Eng. Lab., University of Michigan, Ann Arbor, SEL Report #96, Jan. 15, 1977.

4. CODASYL Data Base Task Group Report, Conf. Data System Languages, April 1971, ACM, New York.

5. CODASYL Data Description Language Journal of Development, June 1973 Report.

6. Duff, I.S., "A Survey of Sparse Matrix Research," Proc. of the IEEE, Vol. 65, No. 4, April 1977, pp. 500-535.

7. Fry, J.P., et al. "Stored-Data Description and Data Translation: A Model and Language," Information Systems, Vol. 2(3), 1977, pp. 95-147.

8. Jensen, Paul S., "An Engineering Analysis System," Proc. ACM 1978 Annual Conference, Washington, D.C., Vol. I of 2, Dec. 4-5-6, 1978, pp. 490-495.

9. Larcombe, M.H.E., "A List Processing Approach to the Solution of Large Sparse Sets of Matrix Equations and the Factorization of the Overall Matrix," Proc. Oxford Conf. on "Large Sparse Sets of Linear Equations," J.K. Reid, Editor, April 1970, Academic Press, New York, 1971, pp. 25-40.

10. MacVeigh, Donald T., "Effect of Data Representation on Cost of Sparse Matrix Operations," Acta Informatica, Vol. 7, 1977, pp. 361-394.

11. Maurer, Herman H., "Data Structures and Programming Techniques," Translated by Camille C. Price, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1977.

12. Pooch, U.W. and Nieder, A., "A Survey of Indexing Techniques for Sparse Matrices," ACM Computing Surveys, Vol. 5, No. 2, June 1973, pp. 109-133.

13. Ramirez, J., "Automatic Generation of Data Conversion-Programs Using a Data Description Language (DDL)," Ph.D. Dissertation, University of Pennsylvania, 1973.

14. Rosenberg, A.L. and Stockmeyer, L.J., "Storage Schemes for Boundedly Extendible Arrays," Acta Informatica, 7, 1977, pp. 289-303.

15. Scheuermann, Peter, "On the Design and Evaluation of Data Bases," IEEE Computer, Feb. 1978, pp. 46-54.

16. Scheuermann, Peter, "Concepts of a Data Base Simulation Language," Proc. ACM SIGMOD Int'l. Conf. on Management of Data, 1977, pp. 144-156.

17. Shu, N.C. et al., "EXPRESS: A Data EXtraction, Processing and REStructuring System," ACM Trans. Database Systems, Vol. 2, No. 2, June 1977, pp. 134-174.

18. Taylor, Robert W., "Generalized Data Base Management System Data Structures and their Mapping to Physical Storage," Ph.D. dissertation, Univ. of Michigan, 1971.

19. Wetherell, C. and Shannon, A., "LR Automatic Parser Generator and LR(1) Parser," Lawrence Livermore Lab., University of California, P.O. Box 808, Livermore, CA 94550, June 14, 1979.
