Proceedings of the
First International Workshop on
Model-Driven Interoperability (MDI 2010)
In conjunction with MoDELS 2010
Oslo, Norway, October 3-5, 2010
http://mdi2010.lcc.uma.es/
ACM International Conference Proceedings Series
ACM Press
Editors:
Jean Bézivin, INRIA & Ecole des Mines de Nantes, France
Richard Mark Soley, OMG, Needham, USA
Antonio Vallecillo, University of Málaga, Spain
ISBN: 978-1-4503-0292-0
The Association for Computing Machinery
2 Penn Plaza, Suite 701
New York, New York 10121-0701
ACM COPYRIGHT NOTICE. Copyright © 2010 by the Association for Computing
Machinery, Inc. Permission to make digital or hard copies of part or all of this work
for personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear this
notice and the full citation on the first page. Copyrights for components of this work
owned by others than ACM must be honored. Abstracting with credit is permitted. To
copy otherwise, to republish, to post on servers, or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from Publications Dept.,
ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org
For other copying of articles that carry a code at the bottom of the first or last page,
copying is permitted provided that the per-copy fee indicated in the code is paid through
the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, +1-978-750-8400, +1-978-750-4470 (fax).
Notice to Past Authors of ACM-Published Articles
ACM intends to create a complete electronic archive of all articles and/or other material
previously published by ACM. If you have written a work that was previously published by
ACM in any journal or conference proceedings prior to 1978, or any SIG Newsletter at any
time, and you do NOT want this work to appear in the ACM Digital Library, please inform
permissions@acm.org, stating the title of the work, the author(s), and where and when
published.
ACM ISBN: 978-1-4503-0292-0
TABLE OF CONTENTS
Editorial to the MDI 2010 Workshop
Jean Bézivin, Richard M. Soley and Antonio Vallecillo…………................................. 1
Model Driven Interoperability in practice: preliminary evidences and issues from an
industrial project
Youness Lemrabet, David Clin, Michel Bigand, Jean-Pierre Bourey
and Nordine Benkeltoum ……………………………………………………………..... 3
Semantic Interoperability of Clinical Data
Idoia Berges, Jesus Bermudez, Alfredo Goñi and Arantza Illarramendi ..……………. 10
A Process Model Discovery Approach for Enabling Model Interoperability in Signal
Engineering
Wikan Danar Sunindyo, Thomas Moser, Dietmar Winkler and Stefan Biffl …………. 15
Efficient Analysis and Execution of Correct and Complete Model Transformations
Based on Triple Graph Grammars
Frank Hermann, Hartmut Ehrig, Ulrike Golas and Fernando Orejas............................. 22
Towards an Expressivity Benchmark for Mappings based on a Systematic Classification
of Heterogeneities
Manuel Wimmer, Gerti Kappel, Angelika Kusel, Werner Retschitzegger,
Johannes Schoenboeck and Wieland Schwinger ……………………………………… 32
Specifying Overlaps of Heterogeneous Models for Global Consistency Checking
Zinovy Diskin, Yingfei Xiong and Krzysztof Czarnecki ……………………………... 42
Anticipating Unanticipated Tool Interoperability using Role Models
Mirko Seifert, Christian Wende and Uwe Assmann .…………………………………. 52
Aligning Business and IT Models in Service-Oriented Architectures using BPMN and
SoaML
Brian Elvesæter, Dima Panfilenko, Sven Jacobi and Christian Hahn…………………. 61
Domain-specific Templates for Refinement Transformations
Lucia Kapova, Thomas Goldschmidt, Jens Happe and Ralf Reussner ………………. 69
Advanced Modelling Made Simple with the Gmodel Metalanguage
Jorn Bettin and Tony Clark ..………………………………………………………….. 79
Model-driven Rule-based Mediation in XML Data Exchange
Yongxin Liao, Dumitru Roman and Arne J. Berre …………………………………… 89
Behavioural Interoperability to Support Model-Driven Systems Integration
Alek Radjenovic and Richard Paige ………………………………………………….. 98
iii
List of Authors

Assmann, Uwe 52
Benkeltoum, Nordine 3
Berges, Idoia 10
Bermudez, Jesus 10
Berre, Arne J. 89
Bettin, Jorn 79
Biffl, Stefan 15
Bigand, Michel 3
Bourey, Jean-Pierre 3
Clark, Tony 79
Clin, David 3
Czarnecki, Krzysztof 42
Diskin, Zinovy 42
Ehrig, Hartmut 22
Elvesaeter, Brian 61
Goñi, Alfredo 10
Golas, Ulrike 22
Goldschmidt, Thomas 69
Hahn, Christian 61
Happe, Jens 69
Hermann, Frank 22
Illarramendi, Arantza 10
Jacobi, Sven 61
Kapova, Lucia 69
Kappel, Gerti 32
Kusel, Angelika 32
Lemrabet, Youness 3
Liao, Yongxin 89
Moser, Thomas 15
Orejas, Fernando 22
Paige, Richard 98
Panfilenko, Dima 61
Radjenovic, Alek 98
Retschitzegger, Werner 32
Reussner, Ralf 69
Roman, Dumitru 89
Schoenboeck, Johannes 32
Schwinger, Wieland 32
Seifert, Mirko 52
Sunindyo, Wikan Danar 15
Wende, Christian 52
Wimmer, Manuel 32
Winkler, Dietmar 15
Xiong, Yingfei 42
Program Committee

Patrick Albert, IBM, France
Uwe Assmann, Technische Universität Dresden, Germany
Colin Atkinson, University of Mannheim, Germany
Jorn Bettin, Sofismo AG, Switzerland
Jean Pierre Bourey, Laboratoire de Génie Industriel de Lille, France
Tony Clark, Middlesex University, UK
Robert Clarisó, Universitat Oberta de Catalunya, Spain
Gregor Engels, University of Paderborn, Germany
Jean Marie Favre, University of Grenoble, France
Robert France, Colorado State University, USA
Dragan Gasevic, Athabasca University, Canada
Sébastien Gérard, CEA LIST, France
Martin Gogolla, University of Bremen, Germany
Jeff Gray, University of Alabama, USA
Esther Guerra, Carlos III University, Spain
Tihamer Levendovszky, Vanderbilt University, USA
Richard Paige, University of York, UK
Alfonso Pierantonio, University of L'Aquila, Italy
Bernhard Rumpe, RWTH Aachen University, Germany
Jim Steel, Queensland University of Technology, Australia
Hans Vangheluwe, University of Antwerp, Belgium
Andrew Watson, OMG, Needham, USA
Jon Whittle, Lancaster University, UK
Manuel Wimmer, Vienna University of Technology, Austria
Additional reviewers
Fabian Buettner, Lars Hamann, Mirco Kuhlmann, Ivano Malavolta, Antonio Navarro Perez,
Ingo Weisemoeller, Christian Wende, Claas Wilke.
iv
Editorial to the Proceedings of the First International
Workshop on Model-Driven Interoperability
Jean Bézivin
INRIA and Ecole des Mines de Nantes
4 rue Alfred Kastler, F-44307 Nantes Cedex 3, France
+33 251 858 704
Jean.Bezivin@inria.fr

Richard Mark Soley
Object Management Group, Inc.
Building A, Suite 300, 140 Kendrick Street, Needham, MA 02494
+1 781 444 0404
soley@omg.org

Antonio Vallecillo
Universidad de Málaga
Bulevar Louis Pasteur 35, 29071 Málaga, Spain
+34 952 132794
av@lcc.uma.es
ABSTRACT
This paper describes the scope, structure and contents of the First International Workshop on Model Driven Interoperability (MDI 2010), which was held on October 5, 2010, in conjunction with the MoDELS 2010 conference in Oslo, Norway.

Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability. I.6.5 [Simulation and Modeling]: Model Development – Modeling methodologies.

General Terms
Design, Standardization, Languages.

Keywords
Model-driven engineering, interoperability.
1. INTRODUCTION
Interoperability is the ability of separate entities, systems or artifacts (organizations, programs, tools, etc.) to work together. Although there has always been a need to achieve interoperability between heterogeneous systems and notations [1], the difficulties involved in overcoming their differences, the lack of consensus on common standards, and the shortage of proper mechanisms and tools have severely hampered this task.

Model-Driven Engineering (MDE) is an emergent discipline that advocates the use of (software) models as primary artifacts of the software engineering process. In addition to their initial goals of capturing user requirements and architectural concerns, and of generating code from them, models are proving to be effective for many other engineering tasks. New model-driven engineering approaches, such as model-driven modernization, models-at-runtime, model-based testing, etc., are constantly emerging.

Model interoperability is much more complex than simply defining a common serialization format, e.g., XMI. This would just resolve the syntactic (or "plumbing") issues between models and modeling tools. However, interoperability should also involve further aspects, including behavioral specifications of models (which in turn describe the behavioral aspects of the systems being modeled), and other "semantic" issues [2] such as agreements on names, context-sensitive information, agreements on concepts (ontologies), integration conflict analysis (including, for example, automatic data model matching), semantic reasoning, etc. Furthermore, interoperability means not only being able to exchange information and to use the information that has been exchanged [3], but also to exchange services and functions in order to operate effectively together. All these interoperability issues and needs become clear in any complex system, as has recently been seen in the HL7 and DICOM healthcare projects, for instance.

Models and MDE techniques (especially metamodeling and model transformations) can play a fundamental role in fully accomplishing these tasks. Thus, models can become cornerstone elements for enabling and achieving interoperability between all kinds of systems and artifacts, including data sets (in the presence of different data schemata, possibly at different levels of abstraction), services (despite their differences in data representation, access protocols and underlying technological platforms), event systems (with different complex types and origins), languages (that use different notations and may have different semantics), tools (with different data formats and semantic representations), technological platforms (with different notations, tools and semantics), etc. It should also be emphasized that the success of MDE has created accidental complexity, for example by generating a number of overlapping metamodels (UML, SysML, BPML, etc.), and this situation reveals itself in a number of contexts as an additional metamodel interoperability problem.
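To make the syntactic "plumbing" level mentioned above concrete, the following is a minimal, hand-written sketch of how a single model element might be serialized in XMI; the model name, class and identifiers are invented for illustration, and a real tool-generated XMI file would contain considerably more detail.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Simplified XMI serialization of a UML model containing one class.
     Exchanging files like this solves only syntactic interoperability:
     both tools can parse the document, but nothing here guarantees that
     they agree on what "Patient" means. -->
<xmi:XMI xmi:version="2.1"
         xmlns:xmi="http://www.omg.org/XMI"
         xmlns:uml="http://www.omg.org/spec/UML/20090901">
  <uml:Model xmi:id="_m1" name="HospitalModel">
    <packagedElement xmi:type="uml:Class" xmi:id="_c1" name="Patient"/>
  </uml:Model>
</xmi:XMI>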
2. THE MDI 2010 WORKSHOP
The goal of the MDI 2010 workshop was to discuss the potential role of models as key enablers for interoperability, and the challenges ahead. The workshop aimed to provide a venue where researchers and practitioners concerned with all aspects of model and system interoperability could meet, disseminate and exchange ideas and problems, identify some of the key issues related to model-driven interoperability, and explore possible solutions together.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
MDI2010, October 5, 2010, Oslo, Norway.
Copyright 2010 ACM 978-1-4503-0292-0/10/10…$10.00.
The MDI 2010 workshop was held on October 5, 2010, in
conjunction with the MoDELS 2010 conference in Oslo, Norway.
The workshop was a huge success. An excellent Program Committee was assembled to help with the review process; it included very well-known and respected experts in the topics of the workshop: Patrick Albert, Uwe Assmann, Colin Atkinson,
Jorn Bettin, Jean Pierre Bourey, Tony Clark, Robert Clarisó,
Gregor Engels, Jean Marie Favre, Robert France, Dragan Gasevic,
Sébastien Gérard, Martin Gogolla, Jeff Gray, Esther Guerra,
Tihamer Levendovszky, Richard Paige, Alfonso Pierantonio,
Bernhard Rumpe, Jim Steel, Hans Vangheluwe, Andrew Watson,
Jon Whittle and Manuel Wimmer.
In response to the call for papers, a total of 19 submissions were received. Submitted papers were formally peer-reviewed by three referees each, and 12 papers were finally accepted for presentation at the workshop and publication in these proceedings, which are published in the ACM Digital Library.
Several external reviewers also helped the PC members review the papers: Fabian Buettner, Lars Hamann, Mirco Kuhlmann, Ivano Malavolta, Antonio Navarro Perez, Ingo Weisemoeller, Christian Wende and Claas Wilke.
The accepted papers contribute to different aspects of model-driven interoperability, from its foundations to the potential benefits it may bring to the emerging field of MDE.
The workshop was organized in four sessions. The first three
were dedicated to the presentation of the selected papers. The last
session was dedicated to discussions among the participants about
the open issues and topics identified during the paper
presentations.
3. WORKSHOP PAPERS
The following 12 papers were presented at the workshop:

"Model Driven Interoperability in practice: preliminary evidences and issues from an industrial project" by Youness Lemrabet, David Clin, Michel Bigand, Jean-Pierre Bourey and Nordine Benkeltoum.

"Semantic Interoperability of Clinical Data" by Idoia Berges, Jesús Bermudez, Alfredo Goñi and Arantza Illarramendi.

"A Process Model Discovery Approach for Enabling Model Interoperability in Signal Engineering" by Wikan Danar Sunindyo, Thomas Moser, Dietmar Winkler and Stefan Biffl.

"Efficient Analysis and Execution of Correct and Complete Model Transformations Based on Triple Graph Grammars" by Frank Hermann, Hartmut Ehrig, Ulrike Golas and Fernando Orejas.

"Towards an Expressivity Benchmark for Mappings based on a Systematic Classification of Heterogeneities" by Manuel Wimmer, Gerti Kappel, Angelika Kusel, Werner Retschitzegger, Johannes Schoenboeck and Wieland Schwinger.

"Specifying Overlaps of Heterogeneous Models for Global Consistency Checking" by Zinovy Diskin, Yingfei Xiong and Krzysztof Czarnecki.

"Anticipating Unanticipated Tool Interoperability using Role Models" by Mirko Seifert, Christian Wende and Uwe Assmann.

"Behavioural Interoperability to Support Model-Driven Systems Integration" by Alek Radjenovic and Richard Paige.

"Aligning Business and IT Models in Service-Oriented Architectures using BPMN and SoaML" by Brian Elvesæter, Dima Panfilenko, Sven Jacobi and Christian Hahn.

"Domain-specific Templates for Refinement Transformations" by Lucia Kapova, Thomas Goldschmidt, Jens Happe and Ralf Reussner.

"Advanced Modelling Made Simple with the Gmodel Metalanguage" by Jorn Bettin and Tony Clark.

"Model-driven Rule-based Mediation in XML Data Exchange" by Yongxin Liao, Dumitru Roman and Arne J. Berre.

4. ACKNOWLEDGMENTS
We would like to thank the MoDELS 2010 organization for giving us the opportunity to organize this workshop, especially the Workshop Chairs, Juergen Dingel and Arnor Solberg. Many thanks to all those who submitted papers, and particularly to the contributing authors. Our gratitude also goes to the paper reviewers and the members of the MDI 2010 Program Committee, for their timely and accurate reviews and for their help in choosing and improving the selected papers. Finally, we would like to acknowledge the research projects TIN2008-03107 and P07-TIC-03184, which helped support this workshop.
5. REFERENCES
[1] Wegner, P. Interoperability. ACM Comput. Surv. 28, 1 (March 1996), 285-287.
[2] Heiler, S. Semantic interoperability. ACM Comput. Surv. 27, 2 (June 1995), 271-273.
[3] Institute of Electrical and Electronics Engineers. IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries. 1990.
Model Driven Interoperability in practice: preliminary
evidences and issues from an industrial project
Youness Lemrabet
Univ Lille Nord de France, F-59000 Lille, France
LM2O, Ecole Centrale de Lille, BP48, 59651 Villeneuve d'Ascq cedex, France
(+33) 3 20 33 54 60
Youness.Lemrabet@centralienslille.org

Michel Bigand
Univ Lille Nord de France, F-59000 Lille, France
LM2O, Ecole Centrale de Lille, BP48, 59651 Villeneuve d'Ascq cedex, France
(+33) 3 20 67 60 25
Michel.Bigand@ec-lille.fr

David Clin
Univ Lille Nord de France, F-59000 Lille, France
LM2O, Ecole Centrale de Lille, BP48, 59651 Villeneuve d'Ascq cedex, France
(+33) 6 71 15 33 55
David.clin@ec-lille.fr

Jean-Pierre Bourey
Univ Lille Nord de France, F-59000 Lille, France
LM2O, Ecole Centrale de Lille, BP48, 59651 Villeneuve d'Ascq cedex, France
(+33) 3 20 33 54 08
Jean-Pierre.Bourey@ec-lille.fr

Nordine Benkeltoum
Univ Lille Nord de France, F-59000 Lille, France
LM2O, Ecole Centrale de Lille, BP48, 59651 Villeneuve d'Ascq cedex, France
(+33) 20 67 60 25
nordine.benkeltoum@ec-lille.fr
ABSTRACT
Problems of interoperability inside and outside organizations have recently been the subject of a considerable amount of study. Although the Model Driven Interoperability (MDI) and Service Oriented Architecture approaches are widely accepted among scholars as means to improve interoperability, little is known about the ins and outs of combining these approaches in practice. This article is based on an industrial project called ASICOM, which aimed at building a platform that enables interoperability among industrial partners. It suggests some preliminary evidence and issues for both theory and practice.

Categories and Subject Descriptors
[Software Engineering]: Interoperability. [Simulation and Modeling]: Model Development – Modeling methodologies.

General Terms
Experimentation, Languages

Keywords
Model Driven Interoperability (MDI), Business Process Management (BPM), Business Process Modeling Notation (BPMN), Service oriented architecture (SOA), ATHENA Interoperability framework (AIF).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MDI2010, October 5, 2010, Oslo, Norway.
Copyright 2010 ACM 978-1-4503-0292-0/10/10...$10.00

1. INTRODUCTION
Interoperability is defined as "the ability of two or more systems or components to exchange information and to use the information that has been exchanged" [1]. It has been an important issue for Information Systems (IS) practitioners since the need to integrate heterogeneous IS began to grow. Enterprises, and more widely organizations, meet problems that stem from a lack of interoperability. Enterprises have to fit their functions and processes taking into consideration internal and external constraints. Thanks to this strategy they are able to take advantage of new business opportunities and improve their competitiveness by delivering high-quality products/services while keeping production costs as low as possible [2].

Recent studies show that Model Driven Interoperability (MDI) and a Service Oriented Architecture (SOA) can be combined to support interoperability [3]. The main research question of this article is the following: "How can the MDI and SOA approaches be combined in a collaborative context to improve interoperability and the strategic alignment of IS?" The paper reflects on aspects of enterprise interoperability within the framework of the ASICOM project.

The ASICOM project aimed at providing Small and Medium Enterprises (SMEs) from the trade and logistics sectors with a pragmatic and generic approach that allows them to set up simplified, interoperable and adaptable solutions that improve communication with their partners through dematerialization. More precisely, the ASICOM project focuses on customer (retail firms and stockists) relations to make administrative procedures easier (e.g. the goods clearance procedure and customs duties payment). Furthermore, it allows SMEs to manage their customers' bonded warehouses, in which dutiable goods are stored and manipulated without payment of duty, and to communicate with French administrative customs systems like the Delt@D and Delt@C systems using dematerialized documents. In the ASICOM project, the Service Oriented Architecture was chosen to guide and facilitate the alignment effort between business models and IT models. SOA provides the required flexibility to integrate new SMEs into the ASICOM project.
Based on our own involvement in the ASICOM project, existing interoperability frameworks and modeling practices, we identified the elements that have to be taken into consideration in an enterprise interoperability project, especially for projects that use model-driven development and service-oriented architecture as key solutions to tackle the interoperability problem. To facilitate interoperability and communication at both the modeling and the technical levels, we assume the use of existing modeling practices and standard notations such as Model-Driven Architecture (MDA)1, the Business Motivation Model (BMM)2, the Business Process Modeling Notation (BPMN)3, the Service oriented architecture Modeling Language (SoaML)4, the Business Process Execution Language (BPEL)5, the Unified Modeling Language (UML)6, the eXtensible Markup Language (XML)7, and the Web Service Description Language (WSDL)8.
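At the technical level, such service contracts are typically captured in WSDL. As a rough illustration, the abbreviated WSDL 1.1 sketch below couples two message definitions to one abstract operation; the service, message and namespace names are invented for this example and do not reproduce the actual ASICOM interfaces.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Abbreviated WSDL 1.1 sketch: the messages and the single operation
     of a hypothetical customs-declaration service. Bindings and the
     service element are omitted. -->
<definitions name="DeclarationService"
             targetNamespace="http://example.org/asicom/declaration"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.org/asicom/declaration"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="SubmitDeclarationRequest">
    <part name="declarationId" type="xsd:string"/>
  </message>
  <message name="SubmitDeclarationResponse">
    <part name="accepted" type="xsd:boolean"/>
  </message>
  <portType name="DeclarationPortType">
    <operation name="submitDeclaration">
      <input message="tns:SubmitDeclarationRequest"/>
      <output message="tns:SubmitDeclarationResponse"/>
    </operation>
  </portType>
</definitions>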
The remainder of this paper is divided into four parts. The second section deals with the state of the art on MDI, SOA and SoaML. The third section introduces advantages of and challenges for model-driven systems, and reflects on the combination of the MDI and SOA approaches to support interoperability through the ATHENA Interoperability Framework (AIF); it also describes evidence and issues from the ASICOM project. The paper closes with a conclusion and directions for further research.

2. RELATED WORK
2.1 Overview of MDI
Model-Driven Development (MDD), and in particular OMG's MDA, is emerging as a standard in practice for developing model-driven applications and systems. Figure 1 presents the Reference Model for MDI.

Figure 1. Reference Model for MDI.

The Model Driven Interoperability (MDI) proposal [4] explains how a model-driven approach can be a useful way to solve interoperability problems. It attempts to introduce different abstraction levels to reduce the gap between enterprise models and the code level. The level definition is based on the three levels of MDA: CIM, PIM, and PSM.

A considerable number of interoperability frameworks have evolved during the last 10 years [5]. Projects like ATHENA [6] provide interoperability frameworks that explain how MDD should be applied in software engineering practice to support business interoperability. The ATHENA Interoperability Framework (AIF) describes each system by enterprise models and different aspects. It focuses on the provided and required artifacts of each collaborating system inside or outside an enterprise. In the AIF, interoperations take place at different viewpoints: enterprise/business, process, service and information/data. At each viewpoint a model-driven interoperability approach is prescribed.
1 http://www.omg.org/cgi-bin/doc?omg/03-06-01.pdf
2 http://www.omg.org/cgi-bin/doc?formal/08-08-02.pdf
3 http://www.omg.org/spec/BPMN/2.0
4 http://www.omg.org/spec/SoaML/1.0/Beta2/
5 http://docs.oasis-open.org/wsbpel/2.0/OS/wsbpel-v2.0-OS.pdf
6 http://www.omg.org/spec/UML/2.2/
7 http://www.w3.org/TR/2008/REC-xml-20081126/
8 http://www.w3.org/TR/2008/REC-xml-20081126/
Figure 2. AIF conceptual framework – simplified view.
Figure 2 is derived from the ATHENA Interoperability Framework. It gives a simplified view of the reference model, indicating the required and provided artifacts of two collaborating enterprises. Each enterprise is described by enterprise models and different viewpoints (business, process, service, information) at different levels of abstraction. For [7], interoperations are only meaningful when all aspects of an enterprise are addressed.

2.2 Overview of SOA
The expression service-oriented architecture (SOA) refers to a way of organizing and understanding organizations, communities and systems to maximize agility and scale. Thus it is also seen as an architectural approach, guideline and pattern for realizing a system through a set of provided and required services.

SOA is technology independent; this means that the choice of technologies and tools is secondary. Various technologies might be used to support an SOA implementation. According to recent research [8], "to achieve its potential, an SOA needs to be business-relevant, thus driven by the business and implemented to support the business".

2.2.1 SOA infrastructure Patterns
While SOA infrastructure is far from sufficient to make SOA work, it is a necessary component that underlies any architectural approach [9]. It is crucial to understand the merit of each infrastructure pattern before choosing a style of infrastructure (see Section 3.3.2). It is also important to note that a discussion of the targeted SOA infrastructure patterns does not map to a specific vendor or open source application infrastructure on a one-to-one basis; many products implement hybrid infrastructure patterns. According to [9] there are four SOA infrastructure patterns:

• The service container infrastructure pattern: the service is implemented in a "container" that provides a runtime environment which coordinates the service interactions by marshalling requests to and from the service. For example, in this type of infrastructure a service can be implemented as a servlet on an application server platform.

• The hub-and-spoke infrastructure pattern, in which an integration middleware platform acts as the coordination point for all the interactions between services; this coordination point interacts with services through adapters. This pattern is known as Enterprise Application Integration.

• The centralized messaging infrastructure pattern, which leverages message-oriented middleware and messaging infrastructure to coordinate messages between services (managing the messages matters more than managing the specific runtime endpoints). Consequently, rather than connecting to service endpoints through adapters or a hub-and-spoke approach, one simply needs to instrument the endpoints to use a particular message bus or publish/subscribe infrastructure.

• The network intermediary infrastructure pattern, in which the challenge is to use a single standard for system interoperability at the seventh layer of the OSI9 network model. An intelligent network can be used as SOA infrastructure to mediate the interactions between services. To perform this role, the seventh layer of the OSI network model must be more specific, intelligent, and Service-enabled [9].

2.2.2 Overview of SoaML
The OMG standard Service oriented architecture Modeling Language (SoaML) is aimed at taking advantage of SOA. SoaML provides a new way of designing and modelling SOA solutions using the Unified Modeling Language10 (UML). It is a set of extensions to UML that define SOA concepts and support service modeling and design [10]. The goal of SoaML is also to support the automatic generation of SOA-derived artifacts following an MDA approach.

SoaML offers several benefits, such as:

• allowing service interoperability at the model level;

• enabling a community or organization to work together using SOA services at a higher level of abstraction;

• addressing service interaction concerns at the architectural level by using architecture as the bridge between business requirements and automated IT solutions;

• leveraging and integrating with existing OMG standards.

9 Open Systems Interconnection
10 http://www.omg.org/spec/UML/2.3/
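In practice, SoaML models are exchanged between tools as XMI, with the SoaML stereotypes applied to ordinary UML elements. The fragment below is a schematic, hand-written sketch of a participant exposing a port typed by a service interface; the element names and identifiers are invented, and a real tool export would serialize the stereotype applications and many structural details differently.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Schematic sketch of SoaML-style UML content in XMI form: a
     participant class with a port typed by a service interface.
     This is not a validated SoaML serialization. -->
<xmi:XMI xmi:version="2.1" xmlns:xmi="http://www.omg.org/XMI"
         xmlns:uml="http://www.omg.org/spec/UML/20090901">
  <uml:Model xmi:id="_m" name="ServicesModel">
    <packagedElement xmi:type="uml:Class" xmi:id="_p1" name="Shipper">
      <ownedAttribute xmi:id="_port1" name="clearance" type="_si1"/>
    </packagedElement>
    <packagedElement xmi:type="uml:Interface" xmi:id="_si1"
                     name="GoodsClearanceService"/>
  </uml:Model>
  <!-- Applications of the SoaML Participant and ServiceInterface
       stereotypes would normally follow as separate elements. -->
</xmi:XMI>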
3. SOLUTION AND LESSONS LEARNED: SOA TO RATIONALISE MDI
Our work has been inspired by well-known existing frameworks. Several engineering methods and frameworks exist that deal with the design, construction, implementation, governance and tooling of information systems. These methods belong to the following areas: (i) model-driven development (MDD) frameworks; (ii) enterprise architecture (EA) methodologies and frameworks; (iii) service-oriented development methodologies and frameworks. The best-known enterprise modeling frameworks and architectures are the Zachman Framework [11], The Open Group Architecture Framework [12], the GERAM framework from ISO IS 15704:2000 [13], the GIM architecture [14], the CIMOSA framework [15] and the Praxeme methodology [16]. However, these frameworks do not focus specifically on interoperability problems. Other projects (Shape11 and Bsopt12) aim to support the development of enterprise systems by developing a methodology backed by SOA concepts and a model-driven engineering tool set.

The ATHENA project is based on a multidisciplinary approach that combines three research fields to support the development of enterprise interoperability [7]: (i) enterprise modeling, which defines interoperability requirements and supports solution implementation; (ii) architectures and platforms, which provide the technological base of the interoperability system; and (iii) ontology, which identifies interoperability semantics in the enterprise. We do not take ontology into consideration, since we consider it outside the scope of this study. We rather focus on the enterprise modeling and the architectures and platforms areas.

The idea of interoperability is multi-faceted, so it is necessary to distinguish the interoperability concepts. Using the AIF and MDI approaches, we suggest using a grid to capture good practices at each level of MDI (CIM, PIM, PSM) for each aspect defined in the AIF (business, process, service and information).

11 http://www.shape-project.eu
12 http://www.bsopt.at/
Table 1. MDI approach in each aspect of AIF.

      | Business | Process | Service | Data
CIM   |          |         |         |
PIM   |          |         |         |
PSM   |          |         |         |

Table 1 aims to give a holistic perspective on interoperability, allowing each partner to analyze and understand their business needs and technical requirements. This grid defines interoperability components as a set of sub-domains: the intersection of a level (row) and an aspect (column) constitutes a sub-domain. The 12 sub-domains of interoperability make it easier to define areas of expertise among partners. However, the fulfillment of all sub-domains is not a sign of excellence or maturity. A partner is fully interoperable in the sense that new business relationships can be set up at low cost [17]. This section does not address the full scope of each AIF aspect, but rather gives an overview of the main issues of this project. We give examples and describe each level of interoperability based on our experience in the ASICOM project. The following gives an overview of the central formalisms and concepts, as well as the methods, of each level of the matrix.

3.1 Business aspect
Interoperability at this level is seen as the organizational and operational ability of an enterprise to cooperate with external organizations in spite of different working practices, legislations, cultures and commercial approaches [18]. Cooperating partners must have a compatible vision and focus on the same elements [19]. Thus each partner must start by focusing on its business goals and project objectives using business modeling practices. BMM should be used to define clear goals and objectives for each partner. An industrial network is not a stable and permanent entity; the business objectives of each partner can change, and this evolution must be taken into account.

3.1.1 CIM level
At this level partners have to find the factors that motivate the establishment of business plans and business perspectives, through interviews involving relevant stakeholders and through workshops.

3.1.2 PIM level
The PIM specifies the elements of business plans and stresses the description of business goals, tactics and rules. It is necessary to define the interoperability approach and then to choose the project style that will be implemented. According to ISO 14258 (1990), there are three ways to establish interoperations between related systems (not detailed here) [20]: (i) the integrated approach, (ii) the unified approach and (iii) the federated approach. In the ASICOM project, none of the partners imposes their models, languages and methods of work: (i) the partners do not use a common format for all models (not integrated) and (ii) there is no common meta-model between partners (not unified). The chosen way to tackle the interoperability issue is therefore the federated interoperability approach.

3.1.3 PSM level
At this level each partner must first choose a style of architecture to implement (e.g. SOA), and then understand the various styles of projects to build. The Gartner Group identifies three styles of projects (which are not detailed here) [21] [22]:

• Execute a new and complete SOA approach: the primary objective of these projects is the design, creation and execution of new SOA artifacts.

• Composite applications and business process support: the primary objective of these projects is the assembly and deployment of composite applications and processes. Orchestration of services in support of an application process is important. The focus of these projects is on combining existing functionality rather than creating new business functionality.

• Application integration: the primary objective of these projects is the integration of the data and business logic of applications.

3.2 Process Aspect
Business process models capture what has to be done in the business to achieve the business goals and vision [8]. The business analyst starts by distinguishing business processes from goals and models. The OMG specification BPMN can be used to capture the business processes that are to be shared between the stakeholders. BPMN is very expressive and provides a notation that is intuitive to business users. In this methodology, business processes are designed to cover many types of modeling and can be used at different levels of detail (CIM and PIM).

3.2.1 CIM level
BPMN choreography diagrams, which focus on the exchange of information between participants, can be used at this level to create the initial drafts of processes. Nevertheless, these models must be further refined and related to other kinds of BPMN models. The BPMN 2.0 specification states that implementations are not expected to support choreography modeling elements directly.

3.2.2 PIM level
The BPMN choreography processes can be refined using BPMN collaboration diagrams, which describe the collaboration between participants in detail. First, the business analyst has to identify two types of business processes: (i) public business processes, which are involved in the interaction with the partners, and (ii) private business processes, under the ownership and control of each participant. Then the parts of the process to computerize have to be identified. The BPMN models are then mapped to more technical models at the PSM level using the Business Process Execution Language (BPEL).

3.2.3 PSM level
The BPMN specification explicitly suggests BPEL for the execution of business processes. After the orchestration processes have been described in BPMN, they can be formalized and refined with implementation details using BPEL, which describes how the partners collaborate.
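As a rough illustration of what such a PSM-level artifact looks like, the abbreviated WS-BPEL 2.0 sketch below receives a request, invokes a partner service and replies to the caller; the process, partner link and operation names are invented, and the partner link types, variables and WSDL imports that an engine would require are omitted.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Abbreviated WS-BPEL 2.0 sketch: receive a declaration request,
     invoke a customs service, and reply to the caller. PartnerLink
     types, variables and WSDL imports are omitted. -->
<process name="ClearanceProcess"
         targetNamespace="http://example.org/asicom/clearance"
         xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <sequence>
    <receive partnerLink="client" operation="submitDeclaration"
             createInstance="yes"/>
    <invoke partnerLink="customs" operation="processDeclaration"/>
    <reply partnerLink="client" operation="submitDeclaration"/>
  </sequence>
</process>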
3.3 Service aspect
The main concern of this aspect is to identify SOA services which can be used to enable business agility through business process reuse. This viewpoint bridges the gap between business requirements and a service-based solution. According to [23], a BPMN model does not contain all the information needed to implement an SOA. Consequently, service modeling can be supported by the SoaML formalism, which can be used to model services at the CIM level and then refine them towards a platform-specific implementation. SOA has been associated with a variety of approaches, such as Service-Oriented Analysis and Design (SOAD) [24], Service-Oriented Modeling and Architecture (SOMA) [25] and Praxeme [9], and technologies such as the Enterprise Service Bus (ESB). These different approaches are intended to identify SOA services. The service-centric approach suggested by [5] argues that a goal-driven identification of services allows a better strategic alignment. In this approach BMM and SoaML are used to describe the realization of interoperability through business services. The approach proposes to map BMM to business services instead of business processes, to reduce the complexity introduced by inter-organizational business processes.

3.3.1 CIM level
To work together, the participants must agree on a formalism for describing services at a high level of abstraction. SoaML can be used to model services at both the CIM and PIM levels. For more details, the MDSE methodology [26] and IBM [8] provide guidelines on how to use SoaML to define and specify a service-oriented architecture. SoaML concepts such as Capability, Participant, ServiceArchitecture and ServiceContract can be used at this level. These concepts give a top-level view and describe the communication between the different participants. They are used to express the business operations supported by the service-oriented architecture.

3.3.2 PIM level
Even if implementing SOA should not depend on an SOA platform strategy, enterprises have to define an SOA platform target and SOA infrastructure patterns (see Section 2.2.1). The partners do not have to choose a specific product at this level, but the discussion about the target SOA infrastructure patterns is very important. It is imperative for each partner to understand the advantages and drawbacks of each SOA infrastructure pattern. Thus, the definition of the SOA patterns must be strongly motivated by the interoperability approach chosen at the PIM level of the Business aspect (see Section 3.1.2). In the ASICOM project, a Mediation Information System (MIS) was chosen to support the mediation interoperability approach. The MIS is in charge of (i) information exchange, (ii) services sharing and (iii) behavior orchestration [27]. At the least, the MIS must implement the centralized messaging infrastructure pattern.

In the ASICOM project a MIS seems to be a pertinent way of supporting interoperability for three reasons. Firstly, the members of the ASICOM project need to communicate through their own channels. Secondly, their systems are not adapted to exchanging information with each other. Thirdly, the collaborative processes have to be able to respond to changes in the customs regulations through a single shared business solution.

At this level SoaML models should be used to support IT concerns. The most frequently used SoaML concepts are ServiceInterface and MessageType.

3.3.3 PSM level
Many products offer different infrastructure patterns or hybrid SOA infrastructure approaches. At this level, it is important to answer the question: what type of SOA solution should be implemented? The architect must attend to two points: the targeted technologies for implementing the SOA services and the application infrastructure13 to support the SOA solutions. Both points are discussed below.

The architect must specify the implementation artifacts of the service-oriented architecture in the chosen technology, e.g. Web Services, Java Enterprise Edition (JEE), .NET or multi-agent systems (MAS). The adapted application infrastructure to implement the SOA solutions must then be chosen: the diversity and heterogeneity of the application solutions, business processes and business context of each partner must be considered [7]. In the ASICOM project, two target infrastructures were tested to support the SOA solutions: the open source Petals ESB from the Petals SOA Suite14 and the Business Process Management solution BizAgi15. The BizAgi BPM suite is very intuitive, but it suffers from technology limitations (i.e. it does not support the execution of BPEL and it provides access to existing applications through Web services only). Consequently, we chose the standards-based integration platform Petals ESB as the application infrastructure.

The implementation of each interoperability approach can be supported by one or many SOA infrastructure patterns (see Section 2.2.1).

In the ASICOM project we chose Web Services and Java Enterprise Edition (JEE) technologies to build the SOA services. Thus WSDL and the XML Schema Definition Language16 (XSD) are used to support syntactical interoperation.

3.4 Data aspect
The data models have to be studied in parallel with the process and service models. A traditional task in service and process modeling is to create and manipulate information. As we said in Section 3.3, the data-structure deficit is evident in BPMN; the concept of message flow is not supported by data models [25]. Data and information models are out of the scope of BPMN, and UML class diagrams can be used to describe messages.

In this section we present a very simple example from the data model of the ASICOM warehouse management module, and we show how it can be refined to generate a physical model which represents the relational database concepts. The warehouse management module provides functionality to manage multiple, structured stock locations.

13 Gartner has defined a new category of software called "Application Infrastructure". "Application infrastructure includes the majority of runtime middleware, as well as application development and management tools that support the new generation of applications, based on service-oriented architecture (SOA), event-driven architecture (EDA) and business process management (BPM)" [21]
14 http://www.petalslink.com/en
15 http://www.bizagi.com/
16 http://www.w3.org/TR/xmlschema-2/
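Since WSDL and XSD carry the syntactical interoperation in ASICOM, message structures such as those of the warehouse example are ultimately expressed as XML Schema types. The following is a minimal sketch; the element and type names are invented for illustration and do not reproduce the actual ASICOM schemas.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal XSD sketch of a message structure for a hypothetical
     stock-location entity of the warehouse management module. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/asicom/warehouse"
           elementFormDefault="qualified">
  <xs:element name="StockLocation">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="locationId" type="xs:string"/>
        <xs:element name="capacity" type="xs:int"/>
        <xs:element name="bonded" type="xs:boolean"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>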
3.4.1 CIM level
UML has become widely used in object-oriented system modeling on platforms such as J2EE and .NET. A first version of the conceptual data model can be produced at this level (Figure 3). This model permits the identification of the different entities and of how they relate to one another.

Figure 3. Data-model at the CIM level.

3.4.2 PIM level
The conceptual data model is refined at the PIM level: we add details to the logical model without worrying about how they will be implemented. For example, data types can be added to the diagram at this level (Figure 4).

Figure 4. Data-model at the PIM level.

3.4.3 PSM level
The Unified Modeling Language has become a standard object-oriented system modeling language and is supported by major corporations. Thus it can be used for object-relational database modeling. There are many techniques for transforming UML models into object-relational database systems, as discussed in [28]. Those techniques focus on transformations and are suited for use with the Model-Driven Development (MDD) approach.

At this level, details are added to the PIM models (UML class diagrams) to adapt them to a specific platform (i.e. a relational database). Figure 5 shows a very simple example from the database model diagram of the ASICOM project. This diagram includes concepts such as tables, columns, views and foreign keys.

Figure 5. Physical model at the PSM level.

4. CONCLUSION
In this paper we have introduced a new practical vision of interoperability based on Model Driven Architecture and the ATHENA Interoperability Framework. Our research goal is not to propose yet another approach, but to combine existing ones to resolve the interoperability problem. This initial work shows that a model-driven approach and a service-oriented architecture enhance interoperability. However, a number of challenges must be overcome.

The ASICOM project is based on a process-centered approach which associates methodologies, information technologies and governance. It aims to allow people from different backgrounds to collaborate on an interoperability project.

Our next goal is to refine and investigate in more detail the relations between aspects. We believe that being able to model service orchestration with BPMN and BPEL, and service details with SoaML, in order to generate SOA artifacts, is an important step towards solving the interoperability problem. We will continue to work on service modeling and transformation, in particular using the Software and Systems Process Engineering Meta-Model (SPEM) to define the development process of an interoperable project using a service-oriented architecture.

5. ACKNOWLEDGMENTS
This work was partially funded by the ASICOM project. The project, started in April 2008, was approved by two French poles of competitiveness: PICOM in the Trade Industries domain and Nov@log in the Logistics domain.

6. REFERENCES
[1] IEEE, 1990. IEEE (Institute of Electrical and Electronics Engineers): Standard Computer Dictionary – A Compilation of IEEE Standard Computer Glossaries.
[2] Jean-Pierre Lorre, Yiannis Verginadis, Nikos Papageorgiou, and Nicolas Salatge. 2010. Ad-hoc Execution of Collaboration Patterns using Dynamic Orchestration. Enterprise Interoperability IV, Part I, 3-12. DOI: 10.1007/978-1-84996-257-5_1.
[3] ATHENA. Model-Driven Interoperability (MDI) Framework. http://www.modelbased.net/mdi/framework.html
[4] Jean-Pierre Bourey, Reyes Grangel, Guy Doumeingts, Arne J. Berre. Report on Model Driven Interoperability. Technical Report, INTEROP, 2007. http://interop-vlab.eu/ei_public_deliverables/interop-noe-deliverables
[5] Fenglin Han, Espen Moller, Arne J. Berre. 2009. Organizational interoperability supported through goal alignment with BMM and service collaboration with SoaML. Interoperability for Enterprise Software and Applications (IESA '09), International Conference, China (21-22 April 2009), 268-274.
[6] ATHENA. Advanced Technologies for Interoperability of Heterogeneous Enterprise Networks and their Applications, FP6-2002-IST-1, Integrated Project (April 2003).
[7] Arne-Jørgen Berre, Brian Elvesæter, Nicolas Figay, Claudia Guglielmina, Svein G. Johnsen, Dag Karlsen, Thomas Knothe and Sonia Lippe. 2007. The ATHENA Interoperability Framework. Enterprise Interoperability II, 2007, Part VI, 569-580. DOI: 10.1007/978-1-84628-858-6_62.
[8] Jim Amsden. Modeling with SoaML, the Service-Oriented Architecture Modeling Language (January 2010). http://www.ibm.com/developerworks/rational/library/09/modelingwithsoaml-1/index.html
[9] Ronald Schmelzer. 2007. SOA Infrastructure Patterns and the Intermediary Approach (July 2007). http://www.zapthink.com/2007/07/04/soa-infrastructure-patterns-and-the-intermediary-approach/
[10] Michael Stollberg. 2009. Integrated and Tool-supported Methodology. Deliverable D2.2 – Initial Version – Work Package 2, SHAPE Project No. 216408 (January 2009).
[11] Zachman, J. A Framework for Information Systems Architecture. IBM Systems Journal, vol. 31, no. 3, pp. 445-470, 1999.
[12] The Open Group Architecture Framework. 2009. TOGAF version 9. http://www.opengroup.org/
[13] IFIP-IFAC Task Force, 1999. GERAM: Generalized Enterprise Reference Architecture and Methodology, Version 1.6.2, Annex to ISO WD15704.
[14] Doumeingts, G., Vallespir, B., Zanettin, M., Chen, D. 1992. GIM: GRAI Integrated Methodology. A methodology for designing CIM systems. GRAI/LAP, Université Bordeaux 1, version 1.0.
[15] AMICE. 1993. CIMOSA: Open System Architecture for CIM. 2nd extended revised version. Springer-Verlag, Berlin.
[16] Praxeme Institute, Version 2.0 (June 2006). http://www.praxeme
[17] Roland Jochem. 2010. Enterprise Interoperability Assessment. 8th International Conference of Modeling and Simulation, MOSIM'10, Hammamet, Tunisia (December 2010).
[18] ATHENA. 2005. D.A1.3.1: Report on Methodology Description and Guidelines Definition, Version 1.0, ATHENA Integrated Project, Deliverable D.A1.3.1 (March 2005).
[19] Arne-Jørgen Berre, Brian Elvesæter. 2008. Model-based System Development, Part IV: MDI – Model Driven Interoperability. Notes for course material "Model Based System Development", INF5120 (2008).
[20] Chen, D., Dassisti, M. and Tsalgatidou, A. 2005. Interoperability Knowledge Corpus, An Intermediate Report, Deliverable DI.1, Workpackage DI (Domain of Interoperability), INTEROP NoE (November 2005).
[21] Hayward, Simon and Natis, Yefim V. 2006. 'Application Infrastructure' Reflects New Dynamics in the Software Market. Gartner (December 2006).
[22] Johan den Haan. 2008. Architecture Requirements for Service-Oriented Business Applications (May 2008). http://www.theenterprisearchitect.eu/archive/2008/05/19/architecture-requirements-for-service-oriented-business-applications
[23] Jihed Touzi, Frédérick Bénaben, Hervé Pingaud, Jean-Pierre Lorré. 2009. A Model-Driven Approach for Collaborative Service-Oriented Architecture Design. International Journal of Production Economics, Volume 121, Issue 1, 5-20. Elsevier (September 2009).
[24] O. Zimmermann, P. Krogdahl, and C. Gee. Elements of Service-Oriented Analysis and Design: An Interdisciplinary Modeling Approach for SOA Projects. IBM, 2 June 2004. http://www-128.ibm.com/developerworks/webservices/library/ws-soad1/
[25] Arsanjani, A. Service-Oriented Modeling and Architecture: How to Identify, Specify, and Realize Services for Your SOA. IBM whitepaper, 2004.
[26] Brian Elvesæter, Cyril Carrez, Parastoo Mohagheghi, Arne-Jørgen Berre, Svein G. Johnsen. Model-Based Development with SoaML. 2010. http://www.uio.no/studier/emner/matnat/ifi/INF5120/v10/undervisningsmateriale/MDSE-SoaML-INF5120.pdf
[27] Frédérick Bénaben, Jihed Touzi, Vatcharaphum Rajsiri, Sébastien Truptil, Jean-Pierre Lorré, and Hervé Pingaud. 2008. Mediation Information System Design in a Collaborative SOA Context through a MDD Approach (June 2008).
[28] E.S. Grant, R. Chennamaneni, and H. Reza. Towards Analyzing UML Class Diagram Models to Object-Relational Database Systems Transformations. Proceedings of the 24th IASTED International Conference on Database and Applications, Innsbruck, Austria: ACTA Press, 2006, 129-134.
Semantic Interoperability of Clinical Data
Idoia Berges
University of the Basque Country
P. Manuel de Lardizabal 1, Donostia-San Sebastian, Spain
idoia.berges@ehu.es

Jesus Bermudez
University of the Basque Country
P. Manuel de Lardizabal 1, Donostia-San Sebastian, Spain
jesus.bermudez@ehu.es

Alfredo Goñi
University of the Basque Country
P. Manuel de Lardizabal 1, Donostia-San Sebastian, Spain
alfredo@ehu.es

Arantza Illarramendi
University of the Basque Country
P. Manuel de Lardizabal 1, Donostia-San Sebastian, Spain
a.illarramendi@ehu.es
ABSTRACT
The use of Electronic Health Records (EHRs) has brought multiple benefits to the healthcare domain. However, those advantages would be greater if seamless interoperability of EHRs between heterogeneous Health Information Systems were achieved. Nowadays, achieving that kind of interoperability is on the agenda of many national and regional initiatives, and in the majority of cases the problem is addressed through the use of different standards.

In this paper we present a proposal that goes one step further and tackles the interoperability problem from a formal, ontology-driven perspective. Our proposal allows one system to interpret on the fly clinical data sent by another one, even when they use different representations. We present in the paper the three key components of the proposal: 1. An ontology that provides –in its upper level– a canonical representation of EHR statements, more precisely of medical observations, which can then be specialized –in the lower level– by health institutions according to their proprietary models. 2. A translator module that facilitates the definition of the lower level of the ontology from the particular EHR data storage structures, following a semi-automatic approach: first, a translation process maps the underlying data structures –using, whenever possible, information about properties (functional dependencies, etc.)– into ontology elements described in OWL2; next, an edition process lets the health system administrators define new axioms to adjust and enrich the result obtained in the semi-automatic process. Finally we show the third component, a mapping module that helps in the task of defining the links among the terms of the upper and lower levels of the ontology. It obtains a declarative mapping specified in OWL2 and puts a wide range of mapping scenarios within reach of health systems' administrators.
Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability
General Terms
Design
1. INTRODUCTION
There is no doubt that Information Technologies are playing a relevant role in research on, and improvement of, the healthcare domain. In the case of Electronic Health Records, several advantages can be mentioned: first, legibility problems due to poor handwriting –which might lead to misunderstandings– are avoided. Moreover, EHRs offer great clinical decision support, by translating practice guidelines into automated reminders and actionable recommendations [10], which can lead to safer, less error-prone, less expensive and higher-quality care. Finally, another advantage is the possibility of exchanging EHRs among different organizations. A patient is likely to receive medical attention from several institutions over his lifetime, so it seems reasonable for each institution to have unrestricted access at any time to the previously recorded patient data. The authors in [4] have identified certain problems that can be avoided thanks to an effective exchange of EHRs: communicating vital information like adverse drug reaction histories can prevent deaths and other serious consequences; providing clinicians with easy access to patients' previous test results eliminates unnecessary duplication of tests; and monitoring chronically ill patients, which usually involves great costs and collaboration between many professionals at distinct points of care, becomes easier. As beneficial as EHR interoperability may seem, nowadays it is still an unreached goal1, mainly because the Health Information Systems used within medical institutions have been developed independently, which results in a high number of heterogeneous proprietary models for representing and recording EHR information.

1 Epsos project in the European community [6]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
MDI2010 October 5, 2010, Oslo, Norway
Copyright 2010 ACM 978-1-4503-0292-0/10/10 ...$10.00.
One of the most recurring approaches to solving interoperability issues is the use of standards. In the case of EHR interoperability, several standards are under development for this purpose, such as openEHR [16], CEN-13606 [5] and HL7-CDA [9]. The openEHR standard follows a dual-model approach for representing EHRs: the Reference Information Model (RIM) contains basic and generic structures for representing EHR information; terms such as list, table or entry are described at this level. It is a stable model which is not expected to change over time. However, since the RIM is composed of a small number of classes, and these are too general to describe the semantics that clinical terms require, another model is necessary: the archetype model. The archetype model describes knowledge elements, such as Heart Rate or Barthel Index, that are created by using and restricting components of the RIM.

The CEN standard also follows the aforementioned dual-model approach and provides, by now, a quite simple RIM and few archetypes, based on those of openEHR. Finally, HL7-CDA has been developed by HL7 and also follows the layered-model approach. More precisely, it provides a RIM and a draft template specification, which in this standard represents the same idea as the openEHR archetypes. Although the idea of using a standard may seem suitable for the desired goal, the interoperability problem remains unsolved unless these standards merge into a single one.
Moreover, in [11] three different levels of interoperability that can be considered for EHRs are described: level 1 refers to syntactic interoperability, level 2 to partial semantic interoperability and level 3 to full semantic interoperability. The authors also argue that research effort should now be oriented towards the development of mechanisms for achieving full semantic interoperability, in which case neither language nor technological differences will prevent Health Information Systems from seamlessly integrating the received EHRs into the local model. In general, semantic interoperability is defined as the ability of one computer system to receive some information and interpret it in the same sense as intended by the sender system, without prior agreement on the nature of the exchanged data.
In this paper we present a proposal to move towards the
notion of full semantic interoperability of EHRs of medical observations, based on semantic web technologies, and
more precisely on OWL2 [17] ontologies and corresponding
reasoners. These technologies facilitate semantic interoperation between heterogeneous information systems ([15], [2]), as opposed to other formats for interchanging data, such as XML, which do not deal with the semantics of the exchanged data [7]. Two general approaches for interoperability among systems are described in [12]: using a canonical model to which the particular systems are linked, or aligning the particular models pairwise. The proposal presented in this paper follows the former approach and additionally presents the following novel contributions:
• The development of the EHROnt ontology, which represents at different levels the definitions of clinical terms
that appear in EHRs. At the Canonical level, it contains ontological definitions of EHR statements (in particular of medical observations) and at the Application
level, it contains the specializations of the definitions
of the Canonical level according to the standards mentioned previously or according to proprietary models of
health institutions (it favors the notion of extensibility
to different models).
• The management of a reasoning mechanism that, using
axioms stated in the ontology, infers knowledge that allows the discovery of more relationships among the different models used by the different Health Information
Systems (it decreases the need for human intervention).
• The provision of one module that facilitates the task of
obtaining the definitions of the lower level of EHROnt
from the particular EHRs data storage structures and
another module that facilitates the task of linking definitions of the lower level to definitions of the upper
level (it facilitates seamless adaptation of existing
Health Information Systems).
In the area of EHR interoperability a number of related works can be found at present. Among those closest to our proposal we can mention the following: the authors of [13] provide a solution to achieve interoperability between systems that have been developed under the HL7 RIM. However, this proposal requires that the source system have some prior knowledge about the target system and, moreover, it does not tackle the communication between systems that use proprietary EHR specifications. In [3] ontology mappings are proposed between pairs of archetype-based models. Moreover, in [14] a software architecture that transforms an openEHR archetype into a CEN-13606 archetype is presented. Ontologies that describe the archetype models of both standards, in addition to an integrated ontology, are used in the process. Notice that those works do not support the extensibility and the lower degree of user intervention provided by our framework.
In summary, in this paper we present a proposal that allows one system to interpret on the fly clinical statements sent by another one, even when they use proprietary formats. We support our claim with the following techniques:
• Logic-based descriptions: Representations of the clinical statements considered by particular Health Information Systems, described using standards as well as proprietary models, are expressed in our approach by means of OWL2 ontology axioms. Moreover, terms in those axioms are related to canonical ontology terms whose descriptions focus on language- and technology-independent aspects. This approach increases the chances of solving the interoperability issue since it relies mainly on semantic aspects.
• Automated reasoning: All ontology descriptions, as well as the mappings among elements of the ontology, are expressed in the same formalism, OWL2. This uniform representation allows the use of well-known reasoners in order to derive new axioms from the existing ones. Furthermore, the formalism-mismatch problem is avoided and automatic integration is facilitated.
• The use of formal ontologies as a canonical conceptual model, which allows focusing on aspects that are independent of the languages or technologies used to describe EHRs (it favors the notion of semantic interoperability).
• Transfer mechanism: A process, guided by the previous two items, is implemented to transform a particular clinical statement from one health institution into a corresponding clinical statement for another health institution.
In the rest of the paper we first present briefly the main features of the EHROnt ontology developed for representing different kinds of medical observations. Then, the main characteristics of the translator and mapping modules are presented in Sections 3 and 4, respectively. We finish with some conclusions.
2. CANONICAL REPRESENTATION OF
MEDICAL OBSERVATIONS
In general, an EHR includes clinical statements such as
observations, laboratory tests, diagnostic imaging reports,
treatments, therapies, administered drugs and allergies. The
different standards mentioned in the previous section reflect
those kinds of statements in one or another way. Formally,
a clinical statement is an expression of a discrete term of
clinically related information that is recorded because of its
relevance to the care of a patient [8]. In this paper we focus
on the exchangeability of medical observations statements,
which are used to record all notionally objective observations of phenomena and patient-reported phenomena, such
as physical examinations, laboratory results or basic information about the patient (weight, sex,...).
We advocate representing those observations in an ontology called EHROnt. That ontology is made up of two layers (Canonical layer and Application layer) that collect observation statements at different levels of abstraction. This division into layers allows a clearer visualization of the ontology, but it does not imply a technical division of it. The elements of the Canonical layer should be designed by experts in the medical field and should be considered a framework agreement. Moreover, each element of the Canonical layer may be associated to its corresponding SNOMED code [19]. The elements of the Application layer describe the medical observations as they are understood in the specific e-health systems. While the Canonical layer will be the same in all versions of EHROnt, the Application layer will be specific to each system. Thus, each health institution will be responsible for creating this layer and relating it to the Canonical layer, using the tools that we have developed to help in this process, which will be described in Sections 3 and 4, respectively. The representation of the statements described in the EHR standards also belongs to the Application layer. In the EHROnt ontology, the elements that compose EHRs are described as classes and properties using the OWL2 language.
Moreover, in the Canonical layer we propose a subdivision of medical observations into two groups depending on their complexity: simple observations and composite observations. Simple observations have a single value and unit of measurement. Additionally, we have identified three properties that may be relevant when characterizing an observation: the protocol, which records information about how the observation process was carried out, either by indicating a particular clinical protocol (e.g., the Balke protocol for treadmill graded exercise testing) or the medical device used for taking the measurement (e.g., a stethoscope); the anatomical site, which indicates the specific body location in which the observation was taken; and the state of the patient, which is intended to record the state of the subject of the observation during the observation process. On the other hand, composite observations are composed of two or more observations, either simple or composite. They are intended to represent observations of phenomena such as the Glasgow Coma Scale (GCS) value, which is calculated as the sum of the values obtained from three simple observations (the Eye Response (EyeR), the Motor Response (MotorR) and the Verbal Response (VerbalR)), or the more complex Revised Trauma Score (RTS), a physiological scoring system for predicting death that takes into account three measures: the aforementioned Glasgow Coma Scale value, the Systolic Blood Pressure (SysBP) and the Respiration Rate (RespRate). Below, we present some OWL2 axioms that represent classes of medical observations.
Observation ≡ Simple_Obs ⊔ Comp_Obs
Simple_Obs ≡ =0 comp
Simple_Obs ⊑ =1 value ⊓ ≤1 unit ⊓ ≤1 protocol.Protocol ⊓ ∀state.State ⊓ =1 site.AnatomicalSite
Comp_Obs ≡ ≥2 comp.Observation
RTS ≡ Comp_Obs ⊓ ∃comp.GCS ⊓ ∃comp.SysBP ⊓ ∃comp.RespRate
GCS ≡ Comp_Obs ⊓ ∃comp.EyeR ⊓ ∃comp.VerbalR ⊓ ∃comp.MotorR
EyeR ⊑ Simple_Obs
VerbalR ⊑ Simple_Obs
MotorR ⊑ Simple_Obs
SysBP ⊑ Simple_Obs
RespRate ⊑ Simple_Obs
Additional axioms may exist that associate classes of medical observations to SNOMED codes:
RTS ≡ owl:hasValue snomed.{'273885003'}    (1)
EyeR ≡ owl:hasValue snomed.{'281395000'}    (2)
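To make these definitions concrete, the following is a minimal sketch, assuming the owlready2 Python library, of how a fragment of the Canonical layer could be encoded as OWL2 axioms; the ontology IRI and the restriction filler for the cardinality constraint are our own illustrative choices, not the authors' actual artifact.

from owlready2 import get_ontology, Thing, ObjectProperty

onto = get_ontology("http://example.org/ehront.owl")

with onto:
    class Observation(Thing):
        pass
    class comp(ObjectProperty):          # part-of relation between observations
        domain = [Observation]
        range = [Observation]
    class Simple_Obs(Observation):       # no components
        pass
    class Comp_Obs(Observation):         # at least two components
        pass
    # "=0 comp" in the paper is unqualified; Observation is used here as filler.
    Simple_Obs.equivalent_to = [Observation & comp.exactly(0, Observation)]
    Comp_Obs.equivalent_to = [Observation & comp.min(2, Observation)]
    class EyeR(Simple_Obs): pass
    class VerbalR(Simple_Obs): pass
    class MotorR(Simple_Obs): pass
    # GCS ≡ Comp_Obs ⊓ ∃comp.EyeR ⊓ ∃comp.VerbalR ⊓ ∃comp.MotorR
    class GCS(Comp_Obs): pass
    GCS.equivalent_to = [Comp_Obs & comp.some(EyeR)
                                  & comp.some(VerbalR)
                                  & comp.some(MotorR)]

With such definitions in place, a standard OWL2 reasoner (e.g., one invoked through owlready2) can classify individuals and derive the kinds of relationships discussed in Section 4.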
In addition to the EHROnt ontology, our framework also uses three auxiliary domain ontologies. As pointed out previously, there are three relevant properties that often characterize observations: the protocol, the anatomical site and the state of the patient. As a result, a Protocol ontology is necessary to represent this information in a controlled way; we advocate using an ontology that comprises classes from the Device and Procedure categories of SNOMED-CT. Moreover, in order to represent anatomical information, the Foundational Model of Anatomy ontology [18] is suggested. Finally, one ontology has been developed for describing information about the state of the patient, such as the level of exertion (low, medium, high intensity) or the position of the patient (standing, sitting, ...). It is up to the particular systems whether to use these same auxiliary ontologies or to choose other ones; in the latter case, mappings with the proposed auxiliary ontologies should be created.
Finally, our ontology-driven approach presents some similarities with the Knowledge Discovery Metamodel (KDM) notion used in Architecture-Driven Modernization (ADM) [1]. In our case, knowledge is obtained from existing data sources.
3. TRANSLATOR MODULE
Each health institution has its own information system and in the majority of cases it deals with a proprietary EHR representation. However, the interoperability opportunities increase if an ontological representation of the proprietary representations is obtained, because the shared logic-based representation allows formal inference of implicit knowledge. For that reason we have developed a translator module that is in charge of building the Application layer of the EHROnt ontology for each proprietary information system. In many cases this module will receive as input a relational database schema, but in other cases it may receive schemata for semi-structured data sources or plain files.
The output of the translator module is a description mapping D = ⟨S, O, M⟩ that consists of a source schema S, a set of OWL2 axioms O that comprises the Application layer corresponding to the source S, and a valid mapping M. The set of ontology axioms O is the semantic description of the source S, and the third component M is a set of correspondences of the form ⟨C, CS⟩, ⟨P, PS⟩, where C and P are class and property names appearing in O, and CS, PS are sentences, expressed in an appropriate language for the source schema S, that define sets of ground values.
We can consider a universal domain of interpretation ∆ and an extension function ε that associates a set CS^ε ⊆ ∆ with every CS sentence and a correspondence PS^ε ⊆ ∆ × ∆ with every PS sentence. The universal domain ∆ represents the real-world objects of an actual extension of the considered source S.
Given some basic correspondences of the form ⟨C, CS⟩, ⟨P, PS⟩ (let us write M(C) = CS, M(P) = PS), it is straightforward to define compositionally the correspondences for class expressions Cexp and property expressions Pexp (let us write M(Cexp) and M(Pexp)), following the same technique as interpretation definitions in description logics. Then, we say that a set of correspondences M satisfies an OWL2 axiom C ⊑ Cexp if M(C)^ε ⊆ M(Cexp)^ε, and analogously for P ⊑ Pexp. Notice that any equivalence axiom (using ≡) can be expressed as a pair of subsumption axioms (using ⊑ and ⊒).
We say that M is a valid mapping if its correspondences satisfy the axioms in O for any possible extension of the source schema S.
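As an illustration of the satisfaction condition, the following sketch (our own, not the paper's implementation) evaluates class expressions over one concrete extension function ε and checks subsumption axioms by set inclusion. Note that a valid mapping must pass this check for every possible extension of S, whereas the sketch tests a single given one; class expressions are limited to names, intersections and unions.

def extension(expr, ext):
    """Evaluate a class expression over the extension function ε."""
    op, *args = expr
    if op == "name":                      # M(C)^ε for a named class
        return ext[args[0]]
    if op == "and":                       # intersection of extensions
        return set.intersection(*(extension(a, ext) for a in args))
    if op == "or":                        # union of extensions
        return set.union(*(extension(a, ext) for a in args))
    raise ValueError(op)

def satisfies(axioms, ext):
    """M satisfies C ⊑ Cexp iff M(C)^ε ⊆ M(Cexp)^ε."""
    return all(extension(lhs, ext) <= extension(rhs, ext) for lhs, rhs in axioms)

# Toy extension: every RTS row is also an Observation row.
ext = {"RTS": {"r1", "r2"}, "Observation": {"r1", "r2", "o3"}}
axioms = [(("name", "RTS"), ("name", "Observation"))]
assert satisfies(axioms, ext)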
The translation process is divided into two main steps: a
semi-automatic one and an edition one.
Semi-automatic process
We present this step for the case of having a relational
database schema as input. In fact this case is the most
complete from the translation perspective.
First of all, the relations of the relational schema are translated into OWL2 classes, and attributes into properties that have as domain the class related to the relation in which they are defined and as range the type of the attribute. Moreover, integrity constraints are translated into descriptions associated with the properties.
Once the previous task is accomplished, the next one involves enriching the obtained descriptions by using information about dependencies (inclusion, exclusion and functional dependencies), null values and semantic properties (which correspond to domain information for attribute values). This type of information is provided most of the time by the health systems' administrators, because it is rarely available in the database system. Health systems' administrators are supposed to be technically prepared people who have a deep knowledge of the source information system.
All the previous types of properties are applied in the following sequence: first, inclusion properties; then, when the input relational schema is not in second or third normal form, functional dependencies are used to create new classes; next, exclusion dependencies are exploited; and last, integrity constraints and domain information for attribute values are considered.
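A minimal sketch of the first, purely structural translation step follows; all names are illustrative, and the enrichment with dependencies, null values and domain information described above is omitted.

def translate_schema(schema):
    """schema: {relation_name: [(attribute, sql_type), ...]}"""
    classes, properties = [], []
    for relation, attributes in schema.items():
        classes.append(relation)                      # relation -> OWL class
        for attr, sql_type in attributes:
            properties.append({
                "name": f"has{attr.capitalize()}",    # attribute -> property
                "domain": relation,                   # class of its relation
                "range": {"float": "xsd:float"}.get(sql_type, "xsd:string"),
            })
    return classes, properties

classes, props = translate_schema({
    "RTS_Table": [("RR", "float"), ("SBP", "float"), ("total", "float")],
    "GCS_Table": [("ER", "float"), ("MR", "float"), ("VR", "float")],
})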
For example, a particular registration for Revised Trauma
Score values may consist of two relational tables according
to the following schema:
RTS-Table(code, RR, SBP, GCS, total)
GCS-Table(code, ER, MR, VR)
RTS-Table.GCS ⊆ GCS-Table(code)
Then, some axioms obtained using the mentioned inclusion property, for the Application layer for that information
system, are the following:
sa:RTS ≡ ∃sa:hasRR.sa:RR ⊓ ∃sa:hasSBP.sa:SBP ⊓ ∃sa:hasGCS.sa:GCS ⊓ ∃sa:hasTotal.float
sa:GCS ≡ ∃sa:hasER.sa:ER ⊓ ∃sa:hasMR.sa:MR ⊓ ∃sa:hasVR.sa:VR
Edition process
The goal of this step is to permit the health system administrator to create the Application layer of the ontology in a flexible way. The administrator can choose to start from scratch or from the ontological definitions obtained using the semi-automatic module. In either case, the health system administrator can add new axioms to obtain the desired result. For example, the edition process can be used to assign SNOMED codes to the classes created by the semi-automatic process.
sa:RTS ≡ owl:hasValue sa:hasSnomed.{'273885003'}    (3)
sa:ER ≡ owl:hasValue sa:hasSnomed.{'281395000'}    (4)
In summary, the translator module obtains semantic descriptions of the proprietary formats used to represent EHRs, and it has to capture, with the health system administrator's collaboration, semantics that are hidden, in order to make them explicit.
4. MAPPING MODULE
This module is in charge of managing the mappings between the terms of the Application layer and the terms of
the Canonical layer.
In our context, an integration mapping is a structure I = ⟨O, G, M⟩ where O is a set of OWL2 axioms that comprises the Application layer corresponding to a health care institution, G is the set of OWL2 axioms for the Canonical layer, and M is a set of mapping axioms of the form C ⊑ Gexp, C ⊒ Gexp, C ≡ Gexp, where C is a class name from O and Gexp is an OWL2 class expression using only terms from G. Furthermore, M may include generalized property inclusion axioms as provided by OWL2, as well as path mappings, which relate one path in the Application layer with another path in the Canonical layer. A path is a valid composition of properties.
The Mapping module receives as input a set of basic mapping axioms, specifically defined by the system administrator, that relate classes or properties of both layers, such as:
sa:hasSnomed ≡ snomed    (5)
These basic mapping axioms are incorporated into the ontology and, with the help of a reasoner, new relationships
between terms in the Application layer and those in the
Canonical layer are inferred. For instance, applying the basic mapping axiom 5 to axioms 3 and 4 infers:
sa:RTS ≡ owl:hasValue snomed.{'273885003'}    (6)
sa:ER ≡ owl:hasValue snomed.{'281395000'}    (7)
and consequently, applying axioms (1) and (2) from the Canonical layer (see Section 2), the equivalence mappings sa:RTS ≡ RTS and sa:ER ≡ EyeR are obtained. All those mappings are expressed through OWL2 axioms, which puts a wide range of mapping scenarios within reach of health systems' administrators.
Following with the process, the Mapping Module checks whether some path mappings may exist. It is captured from the definition of sa:RTS in the Application layer that there is a path sa:hasGCS·sa:hasER from class sa:RTS to class sa:ER. Moreover, it is captured from the Canonical layer that there is a path comp·comp from class RTS to class EyeR. Since the Mapping Module has already discovered an equivalence mapping between the source classes of both paths (sa:RTS ≡ RTS) and another equivalence mapping between their target classes (sa:ER ≡ EyeR), it suggests that there may be a path mapping between those paths. The system administrator may then either accept or delete the suggested path mapping.
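The suggestion step just described can be summarized by a small sketch (ours, not the module's code): if the source classes and the target classes of two paths are already known to be equivalent, a path mapping between the paths is proposed for the administrator to accept or delete.

def suggest_path_mappings(app_paths, canon_paths, equivalences):
    """app_paths/canon_paths: (source_class, target_class, path) triples;
    equivalences: set of (application_class, canonical_class) pairs."""
    suggestions = []
    for a_src, a_tgt, a_path in app_paths:
        for c_src, c_tgt, c_path in canon_paths:
            if (a_src, c_src) in equivalences and (a_tgt, c_tgt) in equivalences:
                suggestions.append((a_path, c_path))   # candidate path mapping
    return suggestions

suggested = suggest_path_mappings(
    [("sa:RTS", "sa:ER", "sa:hasGCS·sa:hasER")],
    [("RTS", "EyeR", "comp·comp")],
    {("sa:RTS", "RTS"), ("sa:ER", "EyeR")},
)
# -> [("sa:hasGCS·sa:hasER", "comp·comp")]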
5. CONCLUSION
The use of Electronic Health Records has brought several advantages to the healthcare domain. However, there is still much work to do regarding certain issues such as EHR interoperability. We have presented an approach that supports the notion of interoperability of medical observations sustained by two techniques: one, logic-based ontology descriptions of EHR statements as well as of the mappings defined among elements of the ontologies; and two, automated inference on the ontology descriptions.
6. ACKNOWLEDGMENTS
This work is supported by the Spanish Ministry of Education and Science (TIN2007-68091-C02-01) and the Basque
Government (IT-427-07). The work of Idoia Berges is also
supported by the Basque Government (Programa de Formación de Investigadores del Departamento de Educación,
Universidades e Investigación).
7. REFERENCES
[1] Architecture-Driven Modernization, 2010. Available at http://adm.omg.org.
[2] I. Berges, J. Bermudez, A. Goñi, and A. Illarramendi. Semantic Web Technology for Agent Communication Protocols. In Proceedings of the 5th European Semantic Web Conference (ESWC 2008), pages 5–18, Tenerife, Spain, 2008.
[3] V. Bicer, O. Kilic, A. Dogac, and G. B. Laleci. Archetype-Based Semantic Interoperability of Web Service Messages in the Health Care Domain. Int'l Journal on Semantic Web & Information Systems, 1(4):1–22, 2005.
[4] L. Bird, A. Goodchild, and Z. Z. Tun. Experiences with a Two-Level Modelling Approach to Electronic Health Records. Journal of Research and Practice in Information Technology, 35(2):121–138, 2003.
[5] EN 13606-1: Electronic Health Record Communication, 2007.
[6] The Epsos project. http://www.epsos.eu/.
[7] J. Hefflin and J. Hendler. Semantic Interoperability on the Web. In Proceedings of Extreme Markup Languages 2000, pages 111–120. Graphic Communications Association, 2000.
[8] HL7 Version 3 Standard: Clinical Statement Pattern, Release 1. Available at http://www.hl7.org/v3ballot/html/domains/uvcs/uvcs.htm.
[9] HL7-CDA, 2009. Available at http://www.hl7.org.
[10] L. Hoffman. Implementing Electronic Medical Records. Communications of the ACM, 52(11):18–20, Nov. 2009.
[11] D. Kalra, P. Lewalle, A. Rector, J. M. Rodrigues, K. A. Stroetmann, G. Surjan, B. Ustun, M. Virtanen, and P. E. Zanstra. Semantic Interoperability for Better Health and Safer Healthcare. Technical report, European Commission, Jan. 2009.
[12] V. Kashyap and A. P. Sheth. Semantic and schematic similarities between database objects: A context-based approach. The Very Large Databases Journal, 5(4):276–304, 1996.
[13] O. Kilic and A. Dogac. Achieving Clinical Statement Interoperability using R-MIM and Archetype-based Semantic Transformations. IEEE Transactions on Information Technology in Biomedicine, to appear, 2009.
[14] C. Martínez-Costa, M. Menárguez-Tortosa, R. Valencia-García, J. Maldonado, and J. T. Fernández-Breis. Transformación Automática de Arquetipos UNE-EN 13606 y openEHR para Facilitar la Interoperabilidad Semántica. In Inforsalud 2009, Madrid, Spain, Mar. 2009.
[15] L. Obrst. Ontologies for Semantically Interoperable Systems. In Proceedings of the 2003 ACM CIKM International Conference on Information and Knowledge Management, pages 366–369, New Orleans, Louisiana, USA, Nov. 2003. ACM.
[16] openEHR, 2009. Available at http://www.openehr.org.
[17] OWL2 Web Ontology Language. http://www.w3.org/TR/2009/REC-owl2-overview-20091027/.
[18] C. Rosse and J. L. V. Mejino. A Reference Ontology for Biomedical Informatics: the Foundational Model of Anatomy. Journal of Biomedical Informatics, 36:478–500, 2003.
[19] SNOMED, 2009. Available at http://www.ihtsdo.org/snomed-ct/.
A Process Model Discovery Approach for
Enabling Model Interoperability in Signal Engineering
Wikan Danar Sunindyo, Thomas Moser, Dietmar Winkler, Stefan Biffl
Christian Doppler Laboratory for Software Engineering Integration for Flexible Automation Systems
Vienna University of Technology
Favoritenstrasse 9-11/188 1040 Vienna, Austria
+43 588 01 - 18801
{wikan,moser,winkler,biffl}@ifs.tuwien.ac.at
ABSTRACT
In automation systems engineering, signals are considered common concepts for linking information across different engineering disciplines, such as mechanical, electrical, and software engineering. Signal engineering faces tough challenges in managing the interoperability of the heterogeneous data tools and models of each individual engineering discipline, e.g., to make signal handling consistent, to integrate signals from heterogeneous data models/tools, and to manage the versions of signal changes across engineering disciplines. Currently, signal changes across engineering disciplines are primarily managed manually, which is costly and error-prone. The main contribution of this paper is the signal change management process model as an input for the semantic integration of engineering tools and models to support (semi-)automated signal change management. A major result is that the process model discovery approach supports the discovery of semantic integration requirements across heterogeneous engineering tools and models more efficiently than manual signal change management.
Categories and Subject Descriptors
D.2.9 [Software Engineering]: Management – software configuration management, software process models (e.g., CMM, ISO,
PSP).
D.2.12 [Software Engineering]: Interoperability
I.6.5 [Simulation and Modeling]: Model Development – Modeling methodologies.
General Terms
Management, Design.
Keywords
Signal Change Management, Model Interoperability, Automation Systems Engineering.
1. INTRODUCTION
Complex automation systems, like power plants or car manufacturing workshops, typically involve several different engineering disciplines, e.g., mechanical engineering, electrical engineering, and software engineering, that have to collaborate to achieve the project goals. In such complex automation systems, stakeholders from the different engineering fields usually apply individual and discipline-specific tools and models for task execution. Nevertheless, information sharing, collaboration across disciplines, and data exchange are preconditions for successful project execution. Thus, there is a need for interoperability between the different tools and models of such complex automation systems. Currently, a lot of research is being done on achieving interoperability between heterogeneous systems and notations [6, 9, 13]. However, most of the approaches still face the difficulties involved in overcoming their differences, the lack of consensus on common required standards, and the shortage of proper mechanisms and tools [7, 11].
Our observations in industry identified signals as common concepts in complex automation systems that link information across the different engineering disciplines, e.g., mechanical interfaces, electrical signals (wiring), and software I/O variables. The application field called "signal engineering" deals with managing signals from the different engineering disciplines and faces some important challenges, e.g., (1) to make signal handling consistent, (2) to integrate signals from heterogeneous data models/tools, and (3) to manage the versions of signal changes across engineering disciplines.
To overcome these challenges, one needs to define an interoperability model that describes the signal data models and tools of each engineering field as well as their interactions. However, the manual design of an interoperability model spanning different engineering fields is costly and error-prone. In manual model design, all models and required information have to be collected from the domain experts of each engineering field. Then, the domain expert needs to create the model and its interactions based on the different collected models and cross-check with each stakeholder whether the model and the interactions between the different engineering fields are correct. This work and refinement has to be repeated to obtain conflict-free models. Sometimes it is quite hard to arrive at a final model that fulfills the requirements of every party, since the requirements themselves can change over time.
The main contribution of this paper is the proposition of a process
model discovery approach to identify the process model for a
exemplary signal change management process and find out the
requirements of semantic integration between heterogeneous data
models and tools. By using this approach we are able to discover
the interoperability model based on the actual data. This model
can be useful for illustrating the interactions between engineering
fields and detecting the needs of semantic integration in the signal
change management. Major results show that by using the process
model discovery approach, the requirements of semantic integration across heterogeneous tools and data models from different
engineering fields can be discovered efficiently. This model can
support further semantic integration and interoperability of the
models, e.g., by using the Engineering Knowledge Base (EKB)
approach [4, 12].
The remainder of this paper is structured as follows: Section 2 summarizes related work on signal change management, semantic integration, and process modeling and analysis. Section 3 identifies the research issues. Section 4 develops the solution approach to discover a model for signal change management in complex automation systems. Section 5 describes the evaluation based on signal change management processes. Section 6 discusses benefits and limitations of the model discovery approach; finally, Section 7 concludes the paper.
2. RELATED WORK
This section summarizes related work on signal change management, semantic integration technologies, and process analysis approaches as ways to build models for heterogeneous engineering areas.
2.1 Signal Change Management
According to the Merriam-Webster dictionary (http://www.merriam-webster.com), a signal can be defined as an object used to transmit or convey information. In this paper we define a signal as a common concept for linking information between disciplines. Thus, signals are not limited to electrical signals (wiring) in electrical engineering, but also include mechanical interfaces in mechanical engineering and software I/O variables in software engineering. In complex automation systems, we define relationships between the different kinds of signals from the different engineering fields and use them for collaboration and communication.
Formerly, domain experts used manual change management approaches like the one in [1] to manage signal changes between different engineering fields. Manual approaches use documents to manage changes between the different engineering fields in the system. In a primarily manual approach, the researchers collect the signal lists from each engineering field and then connect the signals of the different engineering fields manually. If a signal changes in one document, the change has to be mapped to the relationship document, and all relevant stakeholders have to find out which other signals in different engineering fields could be affected by this change. Manual change handling is costly and error-prone. Thus, the automation of signal change handling is a promising research area to improve product and process quality.
Research on signal change management in the product lifecycle management (PLM) context has been done, e.g., by Horvath and Rudas [11]. They propose a virtual intelligent space for engineering (VISE) to manage signal changes and enhance the decision-making characteristics of PLM. VISE is a highly integrated application of recent CAD/CAM, human-computer, collaborative, product data management, Internet portal, and intelligent information processing techniques in a PLM system. The authors introduce the concept of change affect zones (CAZ): a CAZ comprises a set of engineering objects on which a change may have an effect. Objects in an affect zone may be both inside and outside of a virtual space. Thus, new changes/modifications or conflicts are handled in the CAZ before they are executed.
2.2 Process Modeling and Analysis
Process analysis approaches focus on analyzing (engineering) process data collected during the system's operation. They have been applied to several types of complex systems, for example workflow management systems, Enterprise Resource Planning (ERP) systems, and Customer Relationship Management (CRM) systems. Van der Aalst et al. [16] used workflow technologies to illustrate the structure of the operational processes of a system. Workflow technology provides event data that can be useful for process analysis in software engineering (SE) by enabling particular models that link basic tool events to process/workflow events [16].
Van der Aalst et al. [16] also used stored events, which refer to tasks and process cases originating from people/tools/systems, to monitor and analyze real workflows with respect to designed workflows. This approach is called process mining and can be used for process discovery, performance analysis, and conformance checking. The approach has been implemented in the open source tool ProM (http://www.processmining.org) and can be used to discover the process model from an available event log, analyze the performance of the processes, and suggest possible process improvement candidates.
Ferreira and Ferreira [8] proposed a reusable workflow engine based on Petri net theory as a basis for workflow management. They introduced the workflow kernel, a prototype implementation of common workflow functionality which can be abstracted and reused in systems or embedded in applications intended to become workflow-enabled. The workflow kernel is based on common workflow functionality from several workflow engines, while Petri net theory is used as a process representation language for process analysis.
Sunindyo et al. [15] proposed an approach to monitor, analyze, and improve tool-based engineering processes. The main idea is to generate an interoperability model based on event-based process analysis activities to link heterogeneous software engineering tools.
2.3 Semantic Integration
Semantic integration is an approach to solve problems arising from the intention to share data across disparate and semantically heterogeneous sources [9]. These problems include (a) the detection of duplicate entries, (b) the matching of ontologies or schemas, (c) the modeling of complex relations in different data sources, and (d) the reconciliation of inconsistencies [13]. One of the most important and most actively studied problems in semantic integration is how to establish semantic correspondences (mappings) between the vocabularies of different data sources [6]. Hence, the application of ontologies as semantic web technologies to manage knowledge in specific domains is inevitable. There are five reasons to develop an ontology, i.e., (a) to make domain assumptions more explicit, (b) to share a common understanding of the structure of information among software agents or people, (c) to enable the reuse of domain knowledge, (d) to analyze domain knowledge, and (e) to separate domain knowledge from operational knowledge [14].
Moser et al. [12] introduced the Engineering Knowledge Base (EKB) framework as a semantic web technology approach to address challenges arising from data heterogeneity, applied in the production automation domain [12]. Biffl et al. [4] also used the EKB framework for solving a similar problem in the context of open source software projects. The EKB framework is applicable to solving semantic heterogeneity problems in other automation engineering systems.
3. RESEARCH ISSUES
Complex automation systems, like power plants, need to handle a large amount of data, e.g., up to 40,000 signals originating from different engineering fields. Stakeholders need to manage these signals to ensure signal data consistency within the project. Thus, efficient and effective signal data management approaches are required to handle signal changes properly. In addition, individual engineers may not want to pay attention to signal data management but rather keep focused on their individual engineering work within their discipline, i.e., engineers from different fields should not have to deal with new tools and data formats that make their work even more difficult.
Figure 1. Challenges in Signal Engineering.
Other challenges in signal change management include how to integrate the signal data originating from heterogeneous data models and tools. Figure 1 shows the requirements of mechanical engineers, electrical engineers and software engineers to share related signal data. The mechanical engineer uses different data formats than the electrical engineer and the software engineer do. The challenge is how to integrate signals from heterogeneous data models/tools (1). By using a so-called "virtual common data model" [12], the different engineers can share their related data, from electrical to mechanical signals and to the software variables. The "virtual common data model" becomes a foundation for mapping proprietary tool-specific engineering knowledge and more generic domain-specific engineering knowledge to support transformations between related engineering tools. It is "virtual" because there is no need to provide a separate repository to store the common data model; the management of the common data model with respect to the different engineering fields is done via a specified mapping mechanism. The mechanism of the "virtual common data model" approach includes five steps: (a) extraction of tool data from each engineering field; (b) storage of the extracted tool data in its own model; (c) description of the tool knowledge for each engineering field's tool; (d) description of the common domain knowledge; (e) mapping of the tool knowledge to the common domain knowledge. This work should be done carefully to obtain a complete list of signal mappings from the electrical to the mechanical and the software engineers. In real systems, stakeholders could also include people from other engineering fields.
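As an illustration of step (e), the following sketch (with purely hypothetical field names, not the actual tool formats) maps tool-specific signal records onto common concepts, so records from different disciplines become comparable without a separate common repository.

TOOL_TO_COMMON = {   # per-discipline mapping of tool fields to common concepts
    "electrical": {"WireId": "signal_id", "Voltage": "value"},
    "software":   {"ioVarName": "signal_id", "ioVarValue": "value"},
}

def to_common(discipline, tool_record):
    """Rename tool-specific fields to the common concepts."""
    mapping = TOOL_TO_COMMON[discipline]
    return {mapping[k]: v for k, v in tool_record.items() if k in mapping}

# The same signal seen by two disciplines now compares equal.
assert (to_common("electrical", {"WireId": "S17", "Voltage": 24.0})
        == to_common("software", {"ioVarName": "S17", "ioVarValue": 24.0}))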
This semantic integration challenge can be solved, for example, by applying semantic integration approaches like the Engineering Knowledge Base (EKB) framework [4, 12]. Other challenges are to manage the versions of signal changes across the engineering disciplines and to manage common concepts based on the semantic integration (2). The research question is how to discover the process model from the actual data provided by the heterogeneous engineering fields. Based on this research question, we can discover the structure of the heterogeneous data models/tools and their interactions, and we can identify the need for semantic integration to link heterogeneous data models and tools.
Linking the heterogeneous disciplines can enable a so-called end-to-end test (see Figure 2) to trace signals from hardware sensors to software variables across system borders. This approach supports defect detection during development and changes.
Figure 2. Interaction between different engineering fields.
Figure 2 shows the interaction between the different engineering fields in managing signal changes. Three different engineers, namely the mechanical engineer, the electrical engineer, and the software engineer, typically share a lot of signals that are connected to each other. These relationships should be maintained in an Engineering Knowledge Base, so that when changes happen in one engineering field, they can be propagated to the other engineering fields automatically or semi-automatically.
4. USE CASE
To show how to manage interoperability between engineering tools in complex automation systems, we use a signal change management use case that propagates changes from the mechanical to the electrical and software engineers. Figure 3 illustrates how to merge different signals (and changes) and resolve conflicts between signals coming from different disciplines manually.
Figure 3. Manual Signal Change Management.
The conceptual steps are as follows:
(1) The mechanical engineer executes changes in the mechanical plan, which will also affect the tool data.
(2) The mechanical engineer manually performs a difference analysis for the interaction with the other engineering tools, to check whether there is any conflict with the data of the other engineering tools.
(3) The mechanical engineer manually propagates the changes to the electrical engineering tools and software engineering tools.
(4) The electrical engineer and the software engineer execute the changes in their electrical plan and software development environment.
5. RESULTS
For discovering the interoperability model for signal change management processes at design time and runtime of complex automation systems, we collect process event data from each engineering field, e.g., electrical engineering and software engineering. Using the ProM tool, we conduct an analysis to discover the underlying process model by applying the Alpha Algorithm [5] to the collected data. The Alpha Algorithm is based on discovering transitions that are causally related between different event traces. From the collected event log data as input, we can discover a set of related transitions from all event traces. For each tuple (A, B) in this set, each transition in set A causally relates to all transitions in set B, and no transitions within A (or B) follow each other in some firing sequence. We refine this set by taking only the largest elements with respect to set inclusion. The output is a workflow net that connects each event trace to other related event traces via transitions [5].
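To give an impression of the causality step, here is a simplified sketch (ours) of how direct succession and causal relations can be derived from event traces; the full Alpha Algorithm additionally builds the maximal (A, B) pairs and the workflow net, and in practice these steps are performed by the ProM tool.

def causal_relations(traces):
    """a -> b holds iff a directly precedes b in some trace and b never
    directly precedes a (the causality relation of the Alpha Algorithm)."""
    succ = {(a, b) for t in traces for a, b in zip(t, t[1:])}  # direct succession
    return {(a, b) for a, b in succ if (b, a) not in succ}

log = [["change", "diff_analysis", "propagate", "execute"],
       ["change", "diff_analysis", "resolve_conflict", "propagate", "execute"]]
print(causal_relations(log))
# e.g. ('change', 'diff_analysis') is causal in this toy log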
The evaluation is done by comparing the manual signal change management process and the automated/semi-automated signal change management process after applying the process model discovery approach to reveal the semantic integration requirements in engineering processes across the different engineering fields.
Figure 4 shows the results of the model discovery analysis using the Alpha Algorithm [5]. We obtain four different scenarios in the process model of the signal change management process:
(1) No conflict: the mechanical engineer executes changes and performs a manual difference analysis towards the other engineering fields via the interaction between the mechanical engineering plan, the electrical plan, and the software development environment. The mechanical engineer manually propagates the changes to the other tools. The electrical engineer and software engineer execute the changes in their environments.
(2) Normal conflict: after the manual difference analysis, the mechanical engineer starts managing the conflicts and resolves them by replacing the old signal with the new signal. Once the conflicts are resolved, the mechanical engineer transforms the change to the other engineering fields.
(3) Critical conflict: almost the same as the normal conflict; the difference lies in the action taken after the conflict management is over. The mechanical engineer has to remove the signal and send a notification to the electrical engineer and software engineer, who will treat this as a critical conflict and decide whether to accept the signal removal or reject it.
(4) Looping condition: if the electrical engineer and software engineer reject the signal removal, there is the option to argue about the signal change on the electrical engineer's side. Hence, the situation loops back to the state before the change is transformed to the other engineering fields.
Figure 4. Signal Change Management Processes Model.
From Figure 4, we can suggest improvements to the signal change management process by collecting and integrating the heterogeneous signal data models and tools from the different engineering fields using the Automation Service Bus (ASB) [3] and the EKB [4, 12]. The ASB technically integrates the heterogeneous tools, while the EKB semantically integrates the heterogeneous data models of the electrical engineers, mechanical engineers, and software engineers.
The result of the signal change management improvement can be seen in Figure 5, which shows the usage of the ASB and the EKB to improve the propagation of signal changes from the mechanical engineer to the electrical engineer and software engineer: (1) the mechanical engineer executes a change in his mechanical plan; (2) the mechanical engineer checks in the change and performs a difference analysis by using the ASB and the EKB; (3a & 3b) the electrical engineer and software engineer check out the changes from the ASB and the EKB.
Figure 5. Signal Change Management by using ASB & EKB.
The ASB is an approach similar to the "Enterprise Service Bus" in the business IT context [10], adapted to complex automation systems engineering. The current "Enterprise Service Bus" approach is applied in the business IT context, and most of its implementations make certain design assumptions, e.g., that services will always be online and that resources (computing, network bandwidth, memory) are not the main issues of the design. These assumptions do not fit the requirements of signal change management. Thus, the ASB has to be designed to be more lightweight and able to bridge the technical gaps between engineering processes, models and tools for quality and process improvements [2]. Engineering components are connected to the ASB via connector components, which allows addressing all deployed components as services via the ASB. The ASB integrates components in both office-like design environments and onsite environments with a common integration architecture but different implementations [3]. In signal change management, the different tools that manage the different signals from the heterogeneous engineering fields are connected to the ASB via connector components. Each tool is treated as a component. The communication between the components is also managed by the ASB, so when a signal is changed in one tool, the change is communicated via the ASB and distributed to the other tools automatically.
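A minimal sketch of the connector idea follows (ours, not the ASB code base): each tool registers a connector with the bus, and a checked-in signal change is distributed to all other connectors automatically.

class AutomationServiceBus:
    def __init__(self):
        self.connectors = {}                 # tool name -> change callback

    def register(self, tool, on_change):
        self.connectors[tool] = on_change

    def check_in(self, source_tool, change):
        for tool, on_change in self.connectors.items():
            if tool != source_tool:          # notify every other tool
                on_change(change)

bus = AutomationServiceBus()
bus.register("electrical_plan", lambda c: print("EE sees", c))
bus.register("software_env",    lambda c: print("SE sees", c))
bus.register("mechanical_plan", lambda c: None)
bus.check_in("mechanical_plan", {"signal": "S17", "field": "unit", "new": "V"})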
The EKB is a semantic-web-based framework which supports the efficient integration of information originating from different expert domains without a complete common data schema [12]. The EKB framework stores the engineering knowledge in ontologies and provides semantic mapping services to access design-time and run-time concepts and data. It aims at making tasks that depend on linking information across expert domain boundaries more efficient [12]. The EKB is connected to the other tools via the ASB. In signal change management, the EKB provides the semantic integration between the different signal data from the heterogeneous engineering fields. Each signal is stored, together with its relationships to other signals, in an ontology that forms the base of the EKB; changing a signal thus means modifying the signal entity in the ontology and its relationships.
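The EKB idea can be pictured in miniature as follows (illustrative only, not the EKB API): signals and their cross-discipline relationships form a graph, and changing one signal yields the set of related signals that have to be re-checked in the other disciplines.

RELATED = {   # signal -> signals linked to it in other disciplines (toy data)
    "EE:S17": {"ME:iface_3", "SE:io_s17"},
    "SE:io_s17": {"EE:S17"},
}

def affected_by_change(signal):
    """Return the signals that a change to `signal` may affect."""
    return RELATED.get(signal, set())

print(affected_by_change("EE:S17"))   # {'ME:iface_3', 'SE:io_s17'}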
6. DISCUSSION
In this section, we discuss the benefits and limitations of the model discovery approach compared to the manual approach.
The benefits of the model discovery approach are as follows. (1) The model obtained from the model discovery approach is more precise and accurate because it is generated from actual event data of the different engineering processes. (2) The model is easier to maintain and change: if some modifications of the system happen, we can collect the new event log data and run the process mining tool to get the latest model. (3) The model can be used to learn and understand the whole signal change management process in the system. It also supports model-driven interoperability for other purposes, e.g., decision making and signal defect detection.
The limitations of this approach are as follows. (1) We have to provide complete event log data from each engineering process for model discovery. (2) The ProM tool has limitations regarding its input format, so we have to transform the process event log data into the ProM format (Mining XML, MXML).
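The mentioned transformation can look roughly as follows; the element names follow the general MXML layout as we recall it (WorkflowLog/Process/ProcessInstance/AuditTrailEntry), so the exact schema should be checked against the ProM documentation before relying on this sketch.

import xml.etree.ElementTree as ET

def to_mxml(case_id, events):
    """events: list of (activity, ISO timestamp) pairs for one process case."""
    log = ET.Element("WorkflowLog")
    proc = ET.SubElement(log, "Process", id="signal_change")
    inst = ET.SubElement(proc, "ProcessInstance", id=case_id)
    for activity, timestamp in events:
        entry = ET.SubElement(inst, "AuditTrailEntry")
        ET.SubElement(entry, "WorkflowModelElement").text = activity
        ET.SubElement(entry, "EventType").text = "complete"
        ET.SubElement(entry, "Timestamp").text = timestamp
    return ET.tostring(log, encoding="unicode")

print(to_mxml("case_1", [("change", "2010-06-01T10:00:00"),
                         ("diff_analysis", "2010-06-01T10:05:00")]))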
From this discussion, it follows that other model-driven interoperability systems can adapt the model discovery approach to obtain their process model immediately, rather than building it from scratch and improving it later via several iterations. The alternative to the process model discovery approach is conducting interview sessions with the engineers from the different engineering fields to acquire the requirements for building a model. This model would have to be negotiated between the engineers to obtain an integrated view from the different engineering perspectives that supports interoperability between the different engineering fields.
7. CONCLUSION AND FURTHER WORK
Collaboration and interaction between different engineering fields are critical issues in heterogeneous engineering environments because the individual disciplines apply different tools and data models. This heterogeneity hinders efficient collaboration and interaction between the various stakeholders, e.g., mechanical, electrical, and software engineers. Semantic integration based on the proposed model enables data exchange based on common concepts, e.g., signals, and increases collaboration efficiency and effectiveness. In addition, process observation based on event data is a promising approach for (a) identifying the current (real) process workflow, (b) obtaining measurement data, and (c) providing the foundation for process analysis and improvement.
In this paper, we have explained the usage of a process model discovery approach to derive the model immediately from the actual engineering process data and have identified improvement options for increasing process quality. We applied a signal change management process to illustrate (a) the basic concepts, (b) semantic integration approaches, and (c) process improvement based on collected and analyzed event data.
We found that this approach is easier to adapt in already-running systems which consist of different tools and data models for each engineering area. This approach can also be adapted and generalized to other model-driven interoperability systems.
Future work will include the application of the model discovery approach to other problem domains, as well as exploring how to detect defects in signal change management and how to make decisions on signal changes based on prior experience. We will develop a framework to prepare process model discovery for signal change management in different engineering fields, such that process model discovery and other process analysis approaches can be implemented more effectively and more efficiently.
8. ACKNOWLEDGMENTS
This work has been supported by the Christian Doppler Forschungsgesellschaft and the BMWFJ, Austria. This work has been partially funded by the Vienna University of Technology, in the Complex System Design and Engineering Lab.
9. REFERENCES
[1] Akerblom, R. A management system for quality development. Requirements, methods and traps. In Proceedings of the 19th International Telecommunications Energy Conference (INTELEC 97), 19-23 Oct. 1997.
[2] Biffl, S. and Schatten, A. A Platform for Service-Oriented Integration of Software Engineering Environments. In Proceedings of the Eighth Conference on New Trends in Software Methodologies, Tools and Techniques (SoMeT 09), 2009. IOS Press.
[3] Biffl, S., Schatten, A. and Zoitl, A. Integration of heterogeneous engineering environments for the automation systems lifecycle. In Proceedings of the 7th IEEE International Conference on Industrial Informatics (INDIN 2009), 23-26 June 2009.
[4] Biffl, S., Sunindyo, W. D. and Moser, T. Semantic Integration of Heterogeneous Data Sources for Monitoring Frequent-Release Software Projects. In Proceedings of the 4th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2010), 2010. IEEE Computer Society.
[5] de Medeiros, A. K. A., van Dongen, B. F., van der Aalst, W. M. P. and Weijters, A. J. M. M. Process Mining: Extending the alpha-algorithm to Mine Short Loops. Eindhoven University of Technology, Eindhoven, 2004.
[6] Doan, A., Noy, N. F. and Halevy, A. Y. Introduction to the special issue on semantic integration. SIGMOD Record, 33(4), 2004, 11-13.
[7] Elvesæter, B., Hahn, A., Berre, A.-J. and Neple, T. Towards an Interoperability Framework for Model-Driven Development of Software Systems. 2006.
[8] Ferreira, D. M. R. and Ferreira, J. J. P. Developing a reusable workflow engine. Journal of Systems Architecture, 50(6), 2004, 309-324.
[9] Halevy, A. Why Your Data Won't Mix. ACM Queue, 3(8), 2005, 50-58.
[10] Hohpe, G. and Woolf, B. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley Professional, 2003.
[11] Horvath, L. and Rudas, I. J. Information Content Orientated Product Model Assisted Change Management. In Proceedings of the 5th International Symposium on Intelligent Systems and Informatics (SISY 2007), Subotica, 24-25 Aug. 2007.
[12] Moser, T., Biffl, S., Sunindyo, W. D. and Winkler, D. Integrating Production Automation Expert Knowledge Across Engineering Stakeholder Domains. In Proceedings of the 4th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2010), Krakow, Poland, 2010. Andrzej Frycz Modrzewski Cracow College.
[13] Noy, N. F., Doan, A. H. and Halevy, A. Y. Semantic Integration. AI Magazine, 26(1), 2005, 7-10.
[14] Noy, N. F. and McGuinness, D. Ontology Development 101: A Guide to Creating Your First Ontology. Stanford Knowledge Systems Laboratory, 2001.
[15] Sunindyo, W. D., Moser, T., Winkler, D. and Biffl, S. Foundations for Event-Based Process Analysis in Heterogeneous Software Engineering Environments. In Proceedings of the 36th Euromicro Conference on Software Engineering and Advanced Applications (SEAA 2010), Lille, France, 1-3 September 2010. IEEE Computer Society.
[16] van der Aalst, W. M. P., Weijters, A. J. M. M. and Maruster, L. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering, 16(9), 2004, 1128-1142.
Efficient Analysis and Execution
of Correct and Complete Model Transformations
Based on Triple Graph Grammars
Frank Hermann, Hartmut Ehrig, Ulrike Golas
Department of Theoretical Computer Science and Software Technology
Technische Universität Berlin, Berlin, Germany
frank(at)cs.tu-berlin.de, ehrig(at)cs.tu-berlin.de, ugolas(at)cs.tu-berlin.de
Fernando Orejas
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya, Barcelona, Spain
orejas(at)lsi.upc.edu
ABSTRACT
Triple Graph Grammars are a well-established, formal and intuitive concept for the specification and analysis of bidirectional model transformations. In previous work we have already formalized and analyzed termination, correctness, completeness, local confluence and functional behaviour. In this paper, we show how to improve the efficiency of the execution and analysis of model transformations in practical applications by using triple rules with negative application conditions (NACs). In addition to specification NACs, which improve the specification of model transformations, the generation of filter NACs improves the efficiency of the execution and of the analysis of functional behaviour supported by the critical pair analysis of the tool AGG. We illustrate the results for the well-known model transformation from class diagrams to relational database models.
Categories and Subject Descriptors
D.2.1 [Software Engineering]: Requirements/Specifications; D.2.12 [Software Engineering]: Interoperability; I.6.5 [Simulation and Modeling]: Model Development - Modeling methodologies
General Terms
Theory, Design, Verification
Keywords
Model Transformation, Triple Graph Grammars, Functional Behaviour
1. INTRODUCTION
Model transformations based on triple graph grammars (TGGs) have been introduced by Schürr in [19]. Operational rules are automatically derived from the triple rules and used to define various bidirectional model transformation and integration tasks that are mainly focused on model-to-model transformations. Since 1994, several extensions of the original TGG definitions have been published [20, 17, 10], and various kinds of applications have been presented [22, 11, 16]. Besides model transformation, TGGs are also applied to model integration [1] and model synchronization [8] in order to support model-driven interoperability.
For source-to-target model transformations, so-called forward transformations, forward rules are derived which take the source graph as input and produce a corresponding target graph. Similarly, backward rules are used for target-to-source transformations, making the transformation approach bidirectional. Major properties expected to be fulfilled by model transformations are termination, correctness, completeness, efficient execution and, for several applications, functional behaviour. Termination, completeness and correctness of model transformations have already been studied in [6, 3, 7, 4]. Functional behaviour of model transformations based on triple graph grammars has been analyzed for triple rules without application conditions in [15], using forward translation rules that employ additional translation attributes for keeping track of the elements that have been translated so far.
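The bookkeeping idea behind translation attributes can be sketched as follows (our illustration, not the formal construction of [15]): every source element carries a Boolean attribute that a rule application flips from false to true, so a transformation is complete exactly when all source elements are marked.

# Translation attributes: False = not yet translated, True = translated.
source_elements = {"Class:Person": False, "attr:name": False}

def apply_forward_translation(element):
    if source_elements[element]:
        raise ValueError("element already translated")  # NAC-like guard
    source_elements[element] = True                     # mark as translated
    # ... create the corresponding target and correspondence elements ...

apply_forward_translation("Class:Person")
complete = all(source_elements.values())  # True once every element is marked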
The main aim of this paper is to extend the analysis
techniques for functional behaviour in [15] to the case of
triple rules with negative application conditions (NACs)
and to improve the efficiency of analysis and execution of
model transformations studied in [3, 4, 7, 15]. For this purpose, we distinguish between specification NACs and filter
NACs. Specification NACs have been introduced already in
[7, 4], where triple rules and corresponding derived source
and forward rules have been extended by NACs in order to
improve the modeling power. Exemplarily, we show that
NACs improve the specification of the model transformation CD2RDBM from class diagrams to relational data base
models presented in [6, 3]. Therefore, we extend the forward translation rules introduced in [15] by corresponding
NACs and show that model transformations based on forward translation rules with NACs are equivalent to model
transformations studied in [7, 4], such that main results concerning termination, correctness and completeness can be
transferred to our new framework (see Thm. 1). In order
to analyze functional behaviour we can use general results
for local confluence of transformation systems with NACs
in [18]. But in order to improve efficiency in the context of
model transformations we introduce so-called filter NACs.
They filter out several misleading branches considered in
the standard analysis of local confluence using critical pairs.
In our second main result (see Thm. 2) we show how to analyze functional behaviour of model transformations based
on forward translation rules by analyzing critical pairs for
forward translation rules with filter NACs. Moreover, we introduce a strong version of functional behaviour, which additionally concerns the model transformation sequences themselves. In our third main result (see Thm. 3) we characterize strong functional behaviour by the absence of "significant" critical pairs for the corresponding set of forward translation rules with filter NACs.
In Sec. 2 we introduce model transformations based on
TGGs with specification NACs and show the first main result on termination, correctness, and completeness. In Sec. 3
we introduce forward translation rules with filter NACs and
present our main results on functional and strong functional
behaviour. Based on these main results we discuss in Sec. 4
efficiency aspects of analysis and execution. Related work
and a conclusion are presented in Sections 5 and 6. The
full proofs of the main results are given in [14].
2. MODEL TRANSFORMATIONS BASED ON TRIPLE GRAPH GRAMMARS WITH NACS

Triple graph grammars [19] are a well-known approach for bidirectional model transformations. Models are defined as pairs of source and target graphs, which are connected via a correspondence graph together with its embeddings into these graphs. In this section, we review the main constructions and results of model transformations based on [20, 4, 15] and extend them to the case with NACs.

A triple graph G = (GS ← GC → GT) consists of three graphs GS, GC, and GT, called source, correspondence, and target graphs, together with two graph morphisms sG : GC → GS and tG : GC → GT. A triple graph morphism m = (mS, mC, mT) : G → H between triple graphs G and H consists of three graph morphisms mS : GS → HS, mC : GC → HC and mT : GT → HT such that mS ◦ sG = sH ◦ mC and mT ◦ tG = tH ◦ mC. A typed triple graph G is typed over a triple graph TG by a triple graph morphism typeG : G → TG.

Example 1. Triple Type Graph: Fig. 1 shows the type graph TG of the triple graph grammar TGG for our example model transformation from class diagrams to database models. The source component TGS defines the structure of class diagrams, while in the target component the structure of relational database models is specified. Classes correspond to tables, attributes to columns, and associations to foreign keys. Throughout the example, originating from [6], elements are arranged left, center, and right according to the component types source, correspondence and target. Morphisms starting at a correspondence part are specified by dashed arrows. The denoted multiplicity constraints are ensured by the triple rules in Figs. 3 and 5. Note that the case study uses attributed triple graphs based on E-graphs as presented in [6] in the framework of weak adhesive HLR categories. We refer to [2] for more details on attributed graphs.

[Figure 1: Triple type graph for CD2RDBM. The source component TGS contains the node types Class (name: String) with a "parent" self-association (0..1), Attribute (name: String, is_primary: boolean) via "attrs", Association (name: String) with "src" and "dest", and PrimitiveDataType (name: String) via "type"; the target component TGT contains Table (name: String), Column (type: String, name: String) via "cols", "fcols" and "pkey", and FKey via "fkeys" and "references"; the correspondence component TGC contains CT (Class-Table), AC (Attribute-Column) and AFK (Association-FKey).]

Triple rules synchronously build up their source, target and correspondence graphs, i.e. they are non-deleting. A triple rule tr (left of Fig. 2) is an injective triple graph morphism tr = (trS, trC, trT) : L → R and w.l.o.g. we assume tr to be an inclusion. Given a triple graph morphism m : L → G, a triple graph transformation (TGT) step G =tr,m=> H (right of Fig. 2) from G to a triple graph H is given by a pushout of triple graphs with comatch n : R → H and transformation inclusion t : G ↪ H. A grammar TGG = (TG, S, TR) consists of a triple type graph TG, a triple start graph S = ∅ and a set TR of triple rules.

[Figure 2: Triple rule and triple transformation step. Left: a triple rule tr = (trS, trC, trT) from L = (LS ← LC → LT) to R = (RS ← RC → RT). Right: a TGT step as a pushout (PO) over tr : L → R and the match m : L → G, with comatch n : R → H and inclusion t : G → H.]

Example 2. Triple Rules: The triple rules in Fig. 3 are part of the rules of the grammar TGG for the model transformation CD2RDBM. They are presented in short notation, i.e. the left and right hand side of a rule are depicted in one triple graph. Elements which are created by the rule are labeled with green "++" and marked by green line colouring. The rule "Class2Table" synchronously creates a class with name "n" together with the corresponding table in the relational database. Accordingly, subclasses are connected to the tables of their super classes by the rule "Subclass2Table". Attributes with type "t" are created together with their corresponding columns in the database component via the rule "Attr2Column".
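The compact formal notation above can be mirrored directly in code. The following is a minimal Python sketch under our own simplifying assumptions (plain node sets instead of attributed E-graphs; all names are illustrative and not taken from any TGG tool): a triple graph stores the two morphisms sG and tG as dictionaries, and a candidate triple graph morphism is checked against the two commutativity conditions of the definition.

    # Minimal sketch of triple graphs and triple graph morphisms; illustrative
    # only: plain node sets instead of full attributed E-graphs.
    from dataclasses import dataclass

    @dataclass
    class TripleGraph:
        GS: set    # source nodes
        GC: set    # correspondence nodes
        GT: set    # target nodes
        sG: dict   # sG : GC -> GS
        tG: dict   # tG : GC -> GT

    @dataclass
    class TripleMorphism:
        mS: dict   # GS -> HS
        mC: dict   # GC -> HC
        mT: dict   # GT -> HT

    def is_triple_morphism(m, G, H):
        # check mS o sG = sH o mC and mT o tG = tH o mC on all of GC
        # (H.sG and H.tG play the roles of sH and tH)
        return all(m.mS[G.sG[c]] == H.sG[m.mC[c]] and
                   m.mT[G.tG[c]] == H.tG[m.mC[c]] for c in G.GC)

    # a one-class instance in the spirit of CD2RDBM: Class <- CT -> Table
    G = TripleGraph(GS={"c1"}, GC={"ct1"}, GT={"t1"},
                    sG={"ct1": "c1"}, tG={"ct1": "t1"})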
[Figure 3: Rules for the model transformation CD2RDBM, Part 1. Shown are the triple rules Class2Table(n:String), Subclass2Table(n:String) and Attr2Column(n:String, t:String) in short notation; created elements are marked with "++".]

From each triple rule tr we derive a source rule trS for the construction resp. parsing of a model of the source language and a forward rule trF for forward transformation sequences (see Fig. 4). By TRS and TRF we denote the sets of all source and forward rules derived from the set of triple rules TR. Analogously, we derive a target rule trT and a backward rule trB for the construction and transformation of a model of the target language, leading to the sets TRT and TRB.

[Figure 4: Derived operational rules of a TGG. For a triple rule tr : L → R with L = (LS ← LC → LT) and R = (RS ← RC → RT), the source rule is trS : (LS ← ∅ → ∅) → (RS ← ∅ → ∅), the target rule is trT : (∅ ← ∅ → LT) → (∅ ← ∅ → RT), and the forward rule is trF : (RS ← LC → LT) → (RS ← RC → RT), i.e. the forward rule already contains the complete source component RS in its left hand side.]

A set of triple rules TR and the start graph ∅ generate a visual language VL of integrated models, i.e. models with elements in the source, target and correspondence component. The source language VLS and the target language VLT are derived by projection to the triple components, i.e. VLS = projS(VL) and VLT = projT(VL). The set VLS0 of models that can be generated resp. parsed by the set of all source rules TRS is possibly larger than VLS and we have VLS ⊆ VLS0 = {GS | ∅ ⇒* (GS ← ∅ → ∅) via TRS}. Analogously, we have VLT ⊆ VLT0 = {GT | ∅ ⇒* (∅ ← ∅ → GT) via TRT}.

According to [7, 4] we present negative application conditions for triple rules. In most case studies of model transformations source-target NACs, i.e. either source or target NACs, are sufficient, and we regard them as the standard case. They prohibit the existence of certain structures either in the source or in the target part only, while general NACs may prohibit both at once.

Definition 1. Triple Rules with Negative Application Conditions: Given a triple rule tr = (L → R), a negative application condition (NAC) (n : L → N) consists of a triple graph N and a triple graph morphism n. A NAC with n = (nS, idLC, idLT) is called source NAC and a NAC with n = (idLS, idLC, nT) is called target NAC. A match m : L → G is NAC consistent if for each NAC (n : L → N) there is no injective q : N → G such that q ◦ n = m. A triple transformation G ⇒* H is NAC consistent if all matches are NAC consistent.

Example 3. Triple Rules with NACs: Figure 5 shows the remaining two triple rules for the model transformation CD2RDBM and additionally a derived forward translation rule as explained in Ex. 4. NACs are specified in short notation using the label "NAC" with a frame and red line colour within the frame. A complete NAC is obtained by composing the left hand side of a rule with the red marked elements within the NAC-frame. The rule "Association2ForeignKey" creates an association between two classes and the corresponding foreign key, and the NAC ensures that there is only one primary key at the destination table. The parameters "an" and "cn" are used to set the names of the association and column nodes. The rule "PrimaryAttr2Column" extends "Attr2Column" by additionally creating a link of type "pkey" for the column and by setting "is_primary = true". Furthermore, there is a source and a target NAC, which ensure that neither a primary attribute nor a primary key column is currently present.

[Figure 5: Rules for the model transformation CD2RDBM, Part 2. Shown are the triple rule Association2ForeignKey(an:String, cn:String), which creates an :Association with "src" and "dest" links together with an :FKey and a foreign key column named an+"_"+cn referencing the primary key column of the destination table; the triple rule PrimaryAttr2Column(n:String, t:String) with a source NAC (no existing attribute with is_primary = true at the class) and a target NAC (no existing pkey link at the table); and the derived forward translation rule PrimaryAttr2ColumnFT(n:String, t:String), in which the source elements carry translation attributes that are changed from "F" to "T" and the NAC elements carry translation attributes set to "T".]
The extension of forward rules to forward translation rules is based on additional attributes, called translation attributes, that control the translation process by keeping track of the elements which have been translated so far. While in this paper the translation attributes are inserted in the source models, they can be kept separate as an external pointer structure in order to keep the source model unchanged, as shown in Sec. 5 of [13].

Definition 2. Graph with Translation Attributes: Given an attributed graph AG = (G, D) and a subgraph G0 ⊆ G, we call AG' a graph with translation attributes over AG if it extends AG with one boolean-valued attribute tr_x for each element x (node or edge) in G0 and one boolean-valued attribute tr_x_a for each attribute a associated to such an element x in G0. This means that we have a partition of the items (nodes, edges, or attributes) of G0 into I1 and I2 s.t. AG' = AG ⊕ Att^T(I1) ⊕ Att^F(I2), where Att^T(I1) and Att^F(I2) denote the translation attributes with value T for I1 and value F for I2. Moreover, we define Att^v(AG) := AG ⊕ Att^v(G) for v ∈ {T, F}. In any case we require that there is at most one translation attribute tr_x or tr_x_a for each item.

The new concept of forward translation rules as introduced in [15] extends the construction of forward rules by additional translation attributes in the source component. The translation attributes keep track of the elements that have been translated so far, which ensures that each element in the source graph is not translated twice. The rules are deleting on the translation attributes and thus, the triple transformations are extended from a single (total) pushout to the classical double pushout (DPO) approach [2]. We call these rules forward translation rules, because pure forward rules need to be controlled by additional control conditions, such as the source consistency condition in [6, 4].

Definition 3. Forward Translation Rules with NACs: Given a triple rule tr = (L → R), the forward translation rule of tr is given by trFT = (LFT ←lFT− KFT −rFT→ RFT), defined as follows using the forward rule trF : LF → RF and the source rule trS : LS → RS of tr, where we assume w.l.o.g. that tr is an inclusion:

• LFT = LF ⊕ Att^T(LS) ⊕ Att^F(RS\LS)
• KFT = LF ⊕ Att^T(LS)
• RFT = RF ⊕ Att^T(LS) ⊕ Att^T(RS\LS) = RF ⊕ Att^T(RS)
• lFT and rFT are the induced inclusions.

Moreover, for each NAC n : L → N of tr we define a forward translation NAC nFT : LFT → NFT of trFT as inclusion with NFT = (LFT +L N) ⊕ Att^T(NS\LS).

Remark 1. Note that (LFT +L N) is the union of LFT and N with shared L, and for a target NAC n the forward translation NAC nFT does not contain any translation attributes, because NS = LS.

Example 4. Forward Translation Rule with NACs: Fig. 5 shows in its lower part the forward translation rule with NACs "PrimaryAttr2ColumnFT". According to Def. 3 the source elements of the triple rule "PrimaryAttr2Column" are extended by translation attributes, which are changed by the rule from "F" to "T" if the owning elements are created by the triple rule. Furthermore, the additional elements in the NAC are extended by translation attributes set to "T". Thus, the source NACs concern only elements that have been translated so far.

From the application point of view, model transformation rules should be applied along matches that are injective on the structural part. But it would be too restrictive to require injectivity of the matches also on the data and variable nodes, because we must allow that two different variables are mapped to the same data value. For this reason we use the notion of "almost injective matches" [15], which requires that matches are injective except for the data value nodes. This way, attribute values can still be specified as terms within a rule and matched non-injectively to the same value. Next, we define model transformations based on forward translation rules using complete forward translation sequences.

Definition 4. Completely Translated Graphs and Complete Sequences: A forward translation sequence G0 =tr*FT=> Gn with almost injective matches is called complete if Gn is completely translated, i.e. all translation attributes of Gn are set to true ("T").

Definition 5. Model Transformation Based on Forward Translation Rules: A model transformation sequence (GS, G0 =tr*FT=> Gn, GT) based on forward translation rules with NACs consists of a source graph GS, a target graph GT, and a complete TGT-sequence G0 =tr*FT=> Gn with almost injective matches, G0 = (Att^F(GS) ← ∅ → ∅) and Gn = (Att^T(GS) ← GC → GT).

A model transformation MT : VLS0 ⇛ VLT0 based on forward translation rules with NACs is defined by all model transformation sequences as above with GS ∈ VLS0 and GT ∈ VLT0. All these pairs (GS, GT) define the model transformation relation MTR ⊆ VLS0 × VLT0. The model transformation is terminating if there are no infinite TGT-sequences via forward translation rules and almost injective matches starting with G0 = (Att^F(GS) ← ∅ → ∅) for some source graph GS.
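The bookkeeping role of the translation attributes in Defs. 2-5 can be illustrated by a small Python sketch. This is our own illustration, not the mechanism of AGG or of any TGG engine: all elements of the source graph start with value "F", a forward translation step may only flip attributes from "F" to "T", and a sequence is complete exactly when no "F" remains.

    # Illustrative sketch of the translation-attribute bookkeeping of Defs. 2-5.

    def init_translation_attributes(source_elements):
        # Att^F(GS): every node, edge and attribute of GS starts with tr = F
        return {x: False for x in source_elements}

    def apply_forward_translation_step(tr_attrs, translated_by_step):
        # a step is only possible if all elements it translates are still "F";
        # this ensures that no element of the source graph is translated twice
        if any(tr_attrs[x] for x in translated_by_step):
            return None
        updated = dict(tr_attrs)
        for x in translated_by_step:
            updated[x] = True   # flip "F" -> "T"; rules never flip back
        return updated

    def is_completely_translated(tr_attrs):
        # Def. 4: complete iff all translation attributes are set to "T"
        return all(tr_attrs.values())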
Now, we are able to state our first main result concerning termination, correctness and completeness of model transformations.

Theorem 1. Termination, Correctness and Completeness: Each model transformation MT : VLS0 ⇛ VLT0 based on forward translation rules is

• terminating, if each forward translation rule changes at least one translation attribute from "F" to "T",
• correct, i.e. for each model transformation sequence (GS, G0 =tr*FT=> Gn, GT) there is G ∈ VL with G = (GS ← GC → GT), and it is
• complete, i.e. for each GS ∈ VLS there is G = (GS ← GC → GT) ∈ VL with a model transformation sequence (GS, G0 =tr*FT=> Gn, GT).

Proof Idea. The proof (see [14]) is based on a corresponding result in [15] for the case without NACs and on a fact showing the equivalence of (1) source- and NAC-consistent TGT-sequences based on forward rules and (2) complete NAC-consistent TGT-sequences based on forward translation rules.

Applying a rule according to the DPO approach involves the check of the gluing condition in general. However, in the case of forward translation rules and almost injective matches the gluing condition is always satisfied. This means that the condition does not have to be checked, which simplifies the analysis of functional behaviour in Sec. 3.

Fact 1. Gluing Condition for Forward Translation Rules: Let trFT be a forward translation rule and mFT : LFT → G be an almost injective match; then the gluing condition is satisfied, i.e. there is a transformation step G =trFT,mFT=> H.

Proof Idea. Since only attribution edges are deleted, there are no dangling points, and almost injective matching ensures that there are no identification points (see [14] for the full proof).

3. ANALYSIS OF FUNCTIONAL BEHAVIOUR

Functional behaviour of a model transformation means that each model of the source language LS ⊆ VLS is transformed into a unique model of the target language. This section presents new techniques especially developed to show functional behaviour of correct and complete model transformations based on TGGs.

Definition 6. Functional Behaviour of Model Transformations: A model transformation MT based on forward translation rules has functional behaviour if each execution of MT starting at a source model GS of the source language LS ⊆ VLS leads to a unique target model GT ∈ VLT. The execution of MT requires backtracking if there are terminating TGT-sequences (Att^F(GS) ← ∅ → ∅) =tr*FT=> G'n whose source component differs from Att^T(GS).

The standard way to analyze functional behaviour is to check whether the underlying transformation system is confluent, i.e. all diverging derivation paths starting at the same model finally meet again. In the context of model transformations, confluence only needs to be ensured for transformation paths which lead to completely translated models. For this reason, we introduce so-called filter NACs that extend the model transformation rules in order to avoid misleading paths that cause backtracking; the overall behaviour w.r.t. the model transformation relation is preserved. Filter NACs are based on the following notion of misleading graphs, which can be seen as model fragments that are responsible for the backtracking of a model transformation.

Definition 7. Translatable and Misleading Graphs: A triple graph with translation attributes G is translatable if there is a transformation G ⇒* H such that H is completely translated. A triple graph with translation attributes G is misleading if every triple graph G' with translation attributes and G' ⊇ G is not translatable.

[Figure 6: Step G0 =Class2TableFT=> G with misleading graph G. In G0 the class S1 is already translated (tr = T) and related to a table via a :CT node, while the subclass S3 (name = n) and its "parent" edge S2 are still untranslated (tr = F). Applying Class2TableFT to S3 yields G, in which S3 is translated and related to a new table, but the "parent" edge S2 keeps its translation attribute "F".]

Example 5. Misleading Graph: Consider the transformation step shown in Fig. 6. The resulting graph G is misleading according to Def. 7, because the edge S2 is labeled with a translation attribute set to "F", but there is no rule which may change this attribute in any larger context at any later stage of the transformation. The only rule which changes the translation attribute of a "parent"-edge is "Subclass2TableFT", but it requires that the source node "S3" is labeled with a translation attribute set to "F". However, forward translation rules do not modify translation attributes that are already set to "T" and additionally do not change the structure of the source component.

Definition 8. Filter NAC: A filter NAC n for a forward translation rule trFT : LFT → RFT is given by a morphism n : LFT → N, such that there is a TGT step N =trFT,n=> M with M being misleading. The extension of trFT by some set of filter NACs is called forward translation rule trFN with filter NACs.

Example 6. Forward Translation Rule with Filter NACs: The rule in Fig. 7 extends the rule Class2TableFT by a filter NAC obtained from graph G0 of the transformation step G0 =Class2TableFT=> G in Fig. 6, where G is misleading according to Ex. 5. In Ex. 7 we extend the rule by a further similar filter NAC with "tr = T" for node "S2".
[Figure 7: A forward translation rule with filter NAC: Class2TableFN. LHS: S1:Class with tr = F, name = n, tr_name = F; RHS: S1 with tr = T and tr_name = T, related to a new :Table via a :CT node; filter NAC: an additional class S2 with tr = F connected to S1 by a "parent" edge.]

A direct construction of filter NACs according to Def. 8 would be inefficient, because the size of the graphs to be checked is unbounded. For this reason we now present efficient techniques which support the generation of filter NACs, and we can bound the size without losing generality. At first we present a static technique for a subset of filter NACs and thereafter a dynamic generation technique leading to a much larger set of filter NACs. The first procedure, in Fact 2 below, is based on a sufficient criterion for checking the misleading property. Concerning our example, this static generation leads to the filter NAC shown in Fig. 7 for the rule Class2TableFT for an incoming edge of type "parent".

Fact 2. Static Generation of Filter NACs: Given a triple graph grammar, the following procedure applied to each triple rule tr ∈ TR generates filter NACs for the derived forward translation rules TRFT, leading to forward translation rules TRFN with filter NACs:

• Outgoing Edges: Check the following conditions:
  – tr creates a node (x : Tx) in the source component and the type graph allows outgoing edges of type "Te" for nodes of type "Tx", but tr does not create an edge (e : Te) with source node x.
  – Each rule in TR which creates an edge (e : Te) also creates its source node.
  – Extend LFT to N by adding an outgoing edge (e : Te) at x together with a target node. Add a translation attribute for e with value F. The inclusion n : LFT → N is a NAC-consistent match for tr.
  For each node x of tr fulfilling the above conditions, the filter NAC (n : LFT → N) is generated for trFT, leading to trFN.
• Incoming Edges: Dual case, this time for an incoming edge (e : Te).
• TRFN is the extension of TRFT by all filter NACs constructed above.

Proof Idea. Each generated NAC (n : LFT → N) for a node x in tr with an outgoing (or incoming) edge e in N \ L defines a transformation step N =trFT,n=> M, where edge e is still labeled with "F" but x is labeled with "T". By the structure of forward translation rules it follows that edge e cannot be labeled with "T" at any later model transformation step for any given source model GS. The full proof is given in [14].
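Under simplifying assumptions, the static procedure of Fact 2 can be sketched in a few lines of Python (outgoing-edge case only). The data layout, with keys 'name', 'creates_nodes' and 'creates_edges' and the mapping edge_types_by_source_type, is hypothetical and only serves the illustration:

    # Sketch of the static filter-NAC generation of Fact 2, outgoing-edge case.
    # rules: list of dicts {'name': str,
    #                       'creates_nodes': {node_id: node_type},
    #                       'creates_edges': [(edge_type, src_node_id)]};
    # edge_types_by_source_type: type graph info, Tx -> set of outgoing Te.

    def static_filter_nacs(rules, edge_types_by_source_type):
        nacs = []
        for tr in rules:
            for x, Tx in tr['creates_nodes'].items():
                for Te in edge_types_by_source_type.get(Tx, set()):
                    # condition 1: tr itself creates no outgoing Te-edge at x
                    if any(et == Te and src == x
                           for et, src in tr['creates_edges']):
                        continue
                    # condition 2: every rule creating a Te-edge also creates
                    # the source node of that edge
                    if all(src in r['creates_nodes']
                           for r in rules
                           for et, src in r['creates_edges'] if et == Te):
                        # filter NAC: an extra Te-edge at x, labeled "F"
                        nacs.append((tr['name'], x, Te))
        return nacs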
The following dynamic technique for deriving relevant filter NACs is based on the generation of critical pairs, which define conflicts of rule applications in a minimal context. By the completeness of critical pairs (Lemma 6.22 in [2]) we know that for each pair of two parallel dependent transformation steps there is a critical pair which can be embedded into it. For this reason, the generation of critical pairs can be used to derive filter NACs. A critical pair either directly specifies a filter NAC or a conflict that may lead to non-functional behaviour of the model transformation.

For the dynamic generation of filter NACs we use the tool AGG [23] for the generation of critical pairs of a plain graph transformation system. For this purpose, we first perform the flattening construction for triple graph grammars presented in [3, 15], extended to NACs using the flattening construction for morphisms. A critical pair P1 <=tr1,FT= K =tr2,FT=> P2 consists of a pair of parallel dependent transformation steps. If a critical pair contains a misleading graph P1, we can use the overlapping graph K as a filter NAC of the rule tr1,FT. However, checking the misleading property needs human assistance, such that the generated critical pairs can be seen as filter NAC candidates. But we are currently working on a technique that uses a sufficient criterion to check the misleading property automatically, and we are confident that this approach will provide a powerful generation technique.

Fact 3. Dynamic Generation of Filter NACs: Given a set of forward translation rules, generate the set of critical pairs P1 <=tr1,FT,m1= K =tr2,FT,m2=> P2. If P1 (or similarly P2) is misleading, we generate a new filter NAC m1 : L1,FT → K for tr1,FT leading to tr1,FN, such that K =tr1,FN=> P1 violates the filter NAC. Hence, the critical pair for tr1,FT and tr2,FT is no longer a critical pair for tr1,FN and tr2,FT. But this construction may lead to new critical pairs for the forward translation rules with filter NACs. The procedure is repeated until no further filter NAC can be found or validated. This construction starting with TRFT always terminates if the structural part of each graph of a rule is finite.

Proof. The constructed NACs are filter NACs, because the transformation step K =tr1,FT,m1=> P1 contains the misleading graph P1. The procedure terminates, because the critical pairs are bounded by the number of possible pairwise overlappings of the left hand sides of the rules. The number of overlappings can be bounded by considering only constants and variables as possible attribute values.

For our case study the dynamic generation terminates already after the second round, which is typical for practical applications, because the number of already translated elements in the new critical pairs usually decreases. Furthermore, the number of NACs can be reduced by combining similar NACs differing only in some translation attributes. The remaining critical pairs that do not specify filter NACs show effective conflicts between transformation rules, and they can be provided to the developer of the model transformation to support the design phase.
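Read operationally, Fact 3 is a fixpoint loop over the critical pairs. The following sketch is our own schematic rendering: critical_pairs and is_misleading are assumed helpers, where in practice the pairs are computed by AGG and the misleading check may need human assistance, as discussed above; rules are assumed to be objects with a mutable filter_nacs list.

    # Fixpoint sketch of the dynamic filter-NAC generation of Fact 3.

    def dynamic_filter_nacs(rules, critical_pairs, is_misleading):
        changed = True
        while changed:              # repeat until no new filter NAC is found
            changed = False
            for rule1, m1, K, rule2, m2, P1, P2 in critical_pairs(rules):
                if is_misleading(P1):
                    rule1.filter_nacs.append((m1, K))   # K filters rule1
                    changed = True
                elif is_misleading(P2):
                    rule2.filter_nacs.append((m2, K))   # symmetric case
                    changed = True
        return rules   # remaining pairs are genuine conflicts for the developer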
The filter NACs introduced in this paper on the one hand support the analysis of functional behaviour and on the other hand they also improve the efficiency of the execution. By definition, the occurrence of a filter NAC at an
intermediate model means that the application of the owning rule would lead to a model that cannot be translated
completely, i.e. the execution of the model transformation
would perform backtracking at a later step. This way, a
filter NAC cuts off possible backtracking paths of the model
transformation. As presented in Fact 2 some filter NACs
can be generated automatically and using Fact 3 a larger
set of them can be obtained based on the generation of critical pairs. Finally, by Thms. 2 and 3 we can completely
avoid backtracking if TR FN has no significant critical pair
or, alternatively, if all critical pairs are strictly confluent.
As shown by Fact 4 below, filter NACs do not change the
behaviour of model transformations. The only effect is that
they filter out derivation paths, which would lead to misleading graphs, i.e. to backtracking for the computation of
the model transformation sequence. This means that the
filter NACs filter out backtracking paths. This equivalence
is used on the one hand for the analysis of functional behaviour in Thms. 2 and 3 and furthermore, for improving
the efficiency of the execution of model transformations as
explained in Sec. 4.
Fact 4. Equivalence of Transformations with Filter NACs: Given a triple graph grammar TGG = (TG, ∅, TR) and a triple graph G0 = (GS ← ∅ → ∅) typed over TG, let G'0 = (Att^F(GS) ← ∅ → ∅). Then the following are equivalent for almost injective matches:

1. There is a complete TGT-sequence G'0 =tr*FT=> G via forward translation rules.
2. There is a complete TGT-sequence G'0 =tr*FN=> G via forward translation rules with filter NACs.

Proof Idea. Sequence 1 consists of the same derivation diagrams as Sequence 2. The additional filter NACs in Sequence 2 prevent a transformation rule from creating a misleading graph. Both sequences lead to completely translated models, such that we know that the matches in Sequence 1 also fulfill the filter NACs of the rules in Sequence 2. The full proof is given in [14].

Theorem 2. Functional Behaviour: Let MT be a model transformation based on forward translation rules TRFT and let TRFN extend TRFT with filter NACs such that TRFN is terminating and all critical pairs are strictly confluent. Then MT has functional behaviour. Moreover, the model transformation MT' based on TRFN does not require backtracking and defines the same model transformation relation, i.e. MTR' = MTR.

Remark 2. TRFN is terminating if TRFT is terminating, and a sufficient condition is given in Thm. 1. Termination of TRFN with strict confluence of critical pairs implies unique normal forms by the Local Confluence Theorem in [18].

Proof Idea. The proof (see [14]) is based on a decomposition theorem of triple rule sequences into match-consistent TGT-sequences based on source and forward rules with NACs in [7]. The latter are equivalent to complete TGT-sequences based on forward translation rules without NACs in [15] and with NACs by Fact 1 in [14]. Finally, by Fact 4 complete TGT-sequences via forward translation rules with and without filter NACs are equivalent.

If the set of generated critical pairs of a system of forward translation rules with filter NACs TRFN is empty, we can directly conclude from Thm. 2 that the corresponding system with forward translation rules TRFT has functional behaviour. From an efficiency point of view, model transformations should be based on a compact set of rules, because large rule sets usually involve more matching attempts until a valid match is found. In the optimal case, the rule set ensures that each transformation sequence of the model transformation is itself unique up to switch equivalence. For this reason, we introduce the notion of strong functional behaviour.

Definition 9. Strong Functional Behaviour of Model Transformations: A model transformation based on forward translation rules TRFN with filter NACs has strong functional behaviour if for each GS ∈ LS ⊆ VLS there is a GT ∈ VLT and a model transformation sequence (GS, G0 =tr*FN=> Gn, GT), and each two terminating TGT-sequences G0 =tr*FN=> G'n and G0 =tr*FN=> G'm are switch-equivalent up to isomorphism.

Remark 3.
1. That the sequences are terminating means that no rule in TRFN is applicable any more, but it is not required that the sequences are complete, i.e. that G'n and G'm are completely translated.
2. Strong functional behaviour implies functional behaviour, because G'n and G'm being completely translated implies that G0 =tr*FN=> G'n and G0 =tr*FN=> G'm are terminating TGT-sequences.
3. Two sequences t1 : G0 ⇒* G1 and t2 : G0 ⇒* G2 are called switch-equivalent, written t1 ≈ t2, if G1 = G2 and t2 can be obtained from t1 by switching sequentially independent steps according to the Local Church-Rosser Theorem with NACs [18]. The sequences t1 and t2 are called switch-equivalent up to isomorphism if t1 : G0 ⇒* G1 has an isomorphic sequence t1' : G0 ⇒* G2 (using the same sequence of rules) with i : G1 ≅ G2, written t1' = i ◦ t1, such that t1' ≈ t2. This means especially that the rule sequence in t2 is a permutation of that in t1.

The third main result of this paper shows that strong functional behaviour of model transformations based on forward translation rules with filter NACs can be completely characterized by the absence of "significant" critical pairs.

Definition 10. Significant Critical Pair: A critical pair P1 <=tr1,FN= K =tr2,FN=> P2 for TRFN is called significant if it can be embedded into a parallel dependent pair G'1 <=tr1,FN= G' =tr2,FN=> G'2 such that there is GS ∈ VLS and G'0 =tr*FN=> G' with G'0 = (Att^F(GS) ← ∅ → ∅).

Theorem 3. Strong Functional Behaviour: A model transformation based on terminating forward translation rules TRFN with filter NACs has strong functional behaviour and does not require backtracking iff TRFN has no significant critical pair.

Proof Idea. The proof (see [14]) is based on that of Thm. 2 and on the fact that, in the absence of critical pairs, two terminating sequences with the same source can be shown to be switch-equivalent up to isomorphism using the Local Church-Rosser and Parallelism Theorem with NACs in [18].
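Operationally, Thms. 2 and 3 amount to the following checking procedure, given here as a schematic Python sketch; terminating, critical_pairs, strictly_confluent and significant are placeholders for the analysis steps described in the text (AGG computes the critical pairs, and strict confluence and significance are checked per pair):

    # Schematic decision sketch distilled from Thms. 2 and 3.

    def functional_behaviour(TR_FN, terminating, critical_pairs,
                             strictly_confluent):
        # Thm. 2: termination plus strict confluence of all critical pairs
        return terminating(TR_FN) and all(
            strictly_confluent(cp) for cp in critical_pairs(TR_FN))

    def strong_functional_behaviour(TR_FN, terminating, critical_pairs,
                                    significant):
        # Thm. 3: characterization by the absence of significant critical
        # pairs; in this case no backtracking is required either
        return terminating(TR_FN) and not any(
            significant(cp) for cp in critical_pairs(TR_FN))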
Example 7. Functional Behaviour: We analyze the functional behaviour of the model transformation CD2RDBM with the triple rules TR given in Figs. 3 and 5. First of all, CD2RDBM is terminating according to Thm. 1. For analyzing local confluence we can use the tool AGG [23] for the generation of critical pairs. We use the extended rule Class2TableFN as shown in Fig. 7 and extend it by a further filter NAC obtained by the static generation according to Fact 2. AGG detects two critical pairs showing a conflict of the rule "PrimaryAttr2Column" with itself for an overlapping graph with two primary attributes. Both critical pairs lead to additional filter NACs by the dynamic generation of filter NACs in Fact 3, leading to a system of forward translation rules with filter NACs without any critical pair. Thus, we can apply Thm. 3 and show that the model transformation based on the forward translation rules with filter NACs TRFN has strong functional behaviour and does not require backtracking. Furthermore, by Thm. 2 we can conclude that the model transformation based on the forward translation rules TRFT without filter NACs has functional behaviour and does not require backtracking. As an example, Fig. 8 shows the resulting triple graph (translation attributes are omitted) of a model transformation starting with the class diagram GS.

[Figure 8: Triple graph instance. Source model GS: classes "Company" and "Person", an association "employee" with src "Company" and dest "Person", and a class "Customer" with a parent edge to "Person" and a primary attribute "cust_id" of type "int". Target model GT: tables "Company" and "Person"; table "Company" owns a foreign key with column "employee_cust_id" of type "int" referencing the primary key column "cust_id" of type "int" in table "Person". The correspondence component relates these elements via :CT, :AC and :AFK nodes.]
4. EFFICIENT ANALYSIS AND EXECUTION

Our approach to model transformations based on triple graph grammars (TGGs) with NACs will now be discussed with respect to efficiency, for both the analysis of properties and the execution.

Correctness and Completeness: As shown by Thm. 1 based on [7, 4], model transformations based on TGGs with NACs are correct and complete with respect to the language of integrated models VL generated by the triple rules. Thus, correctness and completeness are ensured by construction.

Termination: As presented in [4], termination is essentially ensured if all triple rules are creating on the source component. This property can be checked statically, automatically and efficiently by checking (RS \ LS) ≠ ∅ for each rule. In Thm. 1 we have given an explicit condition for the forward translation rules to be terminating.

Functional Behaviour: The new concept of filter NACs introduced in this paper provides a powerful basis for reducing the analysis efforts w.r.t. functional behaviour. Once termination is shown as explained above, functional behaviour of model transformations based on forward translation rules TRFT can be checked by generating the critical pairs of the transformation system with AGG [23] and showing strict confluence. The static and dynamic generation of filter NACs (Facts 2 and 3) allows us to eliminate critical pairs. In the best case, all critical pairs disappear, showing the functional behaviour of the model transformation immediately. The new notion of strong functional behaviour of a system based on transformation rules TRFN with filter NACs is completely characterized by the absence of "significant" critical pairs, such that we can ensure for each source model that the transformation sequence is unique up to switch equivalence. Furthermore, the critical pairs generated by AGG can be used to find the conflicts between the rules which may cause non-functional behaviour of the model transformation. The modeler can decide whether to change the rules or to keep the non-functional behaviour.

Efficient Execution: Filter NACs do not only improve the analysis of functional behaviour of a TGG, but also the execution of the model transformation process, by forbidding the application of misleading transformation steps that would lead to a dead end, eliminating the need for backtracking in these cases. Table 1 shows execution times using the transformation engine AGG [23]. The additional overhead caused by filter NACs is fairly small and lies in the area of 10% for the examples in the benchmark, which is based on the average execution times of 100 executions concerning models with 11, 25, 53 and 109 elements (nodes and edges), respectively. The first model with 11 elements is the class diagram presented in the source component of Fig. 8. We explicitly do not compare the execution times of the system with filter NACs with one particular system with backtracking, because these times can vary heavily depending on the used techniques for partial order reduction and the chosen examples. Instead, we present the computed success rates for the system without filter NACs, which show that backtracking will cause a substantial overhead in any case. Thus, the listed times concern successful execution paths only, i.e. those executions that lead to a completely translated model. The success rate for transformations without filter NACs decreases fast when considering larger models. Times for the unsuccessful executions, which appear in the system without filter NACs, are not considered. However, in order to ensure completeness, there is the need for backtracking in the system without filter NACs. This backtracking overhead is in general exponential, and in our case study misleading graphs appear already at the beginning of many transformation sequences, implying that backtracking is costly. Backtracking is reduced by filter NACs and avoided completely in the case that no "significant critical pair" remains present (see Thm. 3), which we have shown to be fulfilled for our example. The additional overhead of about 10% for filter NACs is in most cases much smaller than the effort for backtracking.

Model Size     | without Filter NACs                 | with Filter NACs
[Elements 2)]  | Time 1) [ms] | Success Rate [%]     | Time 1) [ms] | Overhead [%] | Success Rate [%]
11             | 143.75       | 42.86                | 158.33       | 10.14        | 100.00
25             | 302.75       | 16.84                | 335.45       | 10.80        | 100.00
53             | 672.68       | 3.94                 | 742.62       | 10.40        | 100.00
109            | 1,481.43     | 0.17                 | 1,584.86     | 6.98         | 100.00

1) Average time of 100 successful model transformation sequences
2) Nodes and Edges

Table 1: Benchmark, Tool: AGG [23]
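The overhead column of Table 1 is simply the relative difference of the two average execution times, which is easy to re-check:

    # Re-checking the overhead column of Table 1: overhead = t_with/t_without - 1.
    times = [(11, 143.75, 158.33), (25, 302.75, 335.45),
             (53, 672.68, 742.62), (109, 1481.43, 1584.86)]
    for size, t_without, t_with in times:
        print(f"{size:>4} elements: {(t_with / t_without - 1) * 100:5.2f}% overhead")
    # prints 10.14, 10.80, 10.40 and 6.98 percent, matching Table 1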
Moreover, in order to perform model transformations
using highly optimized transformation machines for plain
graph transformation, such as Fujaba and GrGen.Net [21],
we have presented how the transformation rules and models
can be equivalently represented by plain graphs and rules.
First of all, triple graphs and morphisms are flattened according to the construction presented in [3, 15], which can be
extended to NACs using the flattening of morphisms. Furthermore, we presented in this paper how forward rules with
NACs are extended to forward translation rules with NACs,
such that the control condition “source consistency” [6] and
also the gluing condition (Fact 1) are ensured automatically
for complete sequences, i.e. they do not need to be checked
during the transformation.
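The flattening construction itself is conceptually simple; the following sketch (our own data layout, not the representation used by GrGen.NET or Fujaba) turns a triple graph into one plain graph by taking the union of the three components and materializing the morphisms sG and tG as explicit edges:

    # Sketch of the flattening of a triple graph into a plain graph [3, 15].

    def flatten(G):
        # G: TripleGraph as in the earlier sketch (components GS, GC, GT,
        # morphisms sG and tG); correspondence links become ordinary edges
        nodes = G.GS | G.GC | G.GT
        edges = ([("src", c, G.sG[c]) for c in G.GC] +
                 [("trg", c, G.tG[c]) for c in G.GC])
        return nodes, edges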
Summing up, the presented results allow us to combine
the easy, intuitive and formally well founded specification
of model transformations based on triple graph grammars
with NACs with the best available tools for executing graph
transformations while still ensuring correctness and completeness.
5. RELATED WORK

Since 1994, several extensions of the original TGG definitions [19] have been published [20, 17, 10] and various kinds of applications have been presented [22, 11, 16]. The formal construction and analysis of model transformations based on TGGs was started in [6] by analyzing information preservation of bidirectional model transformations and continued in [3, 5, 4, 7, 15], where model transformations based on TGGs are compared with those based on plain graph grammars in [3], TGGs with specification NACs are analyzed in [7], and an efficient on-the-fly construction is introduced in [4]. A first approach to analyzing functional behaviour was presented for restricted TGGs with distinguished kernels in [5], and a more general approach, however without NACs, based on forward translation rules in [15]. The results in this paper for model transformations based on forward translation rules with specification and filter NACs build on the results of all these papers except [5].

In [6] a similar case study based on forward rules is presented, but without using NACs. As a consequence, more TGT-sequences are possible; in particular, an association can be transformed into a foreign key with one primary key, even if there is a second primary attribute that will be transformed into a second primary key at a later stage. This behaviour is not desired from the application point of view. Thus, the grammar with NACs in this paper handles primary keys and foreign keys in a more appropriate way. Furthermore, the system has strong functional behaviour as shown in Sec. 3.

In the following we discuss how the presented results can be used to meet the "Grand Research Challenge of the TGG Community" formulated by Schürr et al. in [20]. The main aims are "Consistency", "Completeness", "Expressiveness" and "Efficiency" of model transformations. The first two effectively require correctness and completeness w.r.t. the triple language VL and additionally termination and functional behaviour. They are ensured as shown in Sec. 3. While we considered functional behaviour w.r.t. unique target models, the more general notion in [20], regarding some semantic equivalence of target models, will be part of further extensions of our techniques. "Expressiveness" requires suitable control mechanisms like NACs, which are used extensively in this paper, and we further extend the technique by additional control mechanisms. In [9] more general application conditions [12] are considered, but functional behaviour is not yet analyzed. In general, the overall usage of complex control structures should be kept low, because they may cause complex computations. Finally, we discussed in Sec. 4 that our approach can be executed efficiently based on efficient graph transformation engines. Especially model transformations fulfilling the conditions of Thm. 3 do not need to backtrack, which bounds the number of transformation steps by the number of elements in the source model, as required in [20].

6. CONCLUSION

In this paper we have studied model transformations based on triple graph grammars (TGGs) with negative application conditions (NACs) in order to improve the efficiency of analysis and execution compared with previous approaches in the literature. The first key idea is that model transformations can be constructed by applying forward translation rules with NACs, which can be derived automatically from the given TGG-rules with NACs. The first main result shows termination under weak assumptions, correctness and completeness of model transformations in this framework, which is equivalent to the approach in [7]. The second key idea is to introduce filter NACs in addition to the NACs in the given TGG-rules, which in contrast are called specification NACs in this paper. Filter NACs are useful to improve the analysis of functional behaviour for model transformations based on critical pair analysis (using the tool AGG [23]) by filtering out backtracking paths and, this way, some critical pairs. The second main result provides a sufficient condition for functional behaviour based on the analysis of critical pairs for forward translation rules with filter NACs. If we are able to construct filter NACs such that the corresponding rules have no more "significant" critical pairs, then the third main result shows that we have strong functional behaviour, i.e. not only are the results unique up to isomorphism, but the corresponding model transformation sequences are also switch-equivalent up to isomorphism. Surprisingly, we can show that the condition "no significant critical pairs" is not only sufficient, but also necessary for strong functional behaviour. Finally, we discuss efficiency aspects of analysis and execution of model transformations and show that our sample model transformation CD2RDBM based on TGG-rules with NACs has strong functional behaviour.

The main challenge in applying our main results on functional and strong functional behaviour is to find suitable filter NACs, such that we have a minimal number of critical pairs for the forward translation rules with filter NACs. For this purpose, we provide static and dynamic techniques
for the generation of filter NACs (see Facts 2 and 3). The dynamic technique includes a check that certain models are misleading. In any case, the designers of the model transformation can also specify some filter NACs directly themselves, if they can ensure the filter NAC property. Furthermore, we can avoid backtracking completely by Thms. 2 and 3 if TRFN has no significant critical pair or, alternatively, if all critical pairs are strictly confluent.

In future work, we will study further static conditions to check whether a model is "misleading", because this allows us to filter out misleading execution paths. In addition, we are currently developing extensions to layered model transformations and amalgamated rules, which allow us to further reduce backtracking in general cases and to simplify the underlying rule sets. Moreover, we will study applications to model transformations that partially relate two DSLs, where some node types are irrelevant for the model transformation.
7. REFERENCES
[1] Ehrig, H., Ehrig, K., Hermann, F.: From Model
Transformation to Model Integration based on the
Algebraic Approach to Triple Graph Grammars. In:
Ermel, C., de Lara, J., Heckel, R. (eds.) Proc.
GT-VMT’08. EC-EASST, vol. 10. EASST (2008)
[2] Ehrig, H., Ehrig, K., Prange, U., Taentzer, G.:
Fundamentals of Algebraic Graph Transformation.
EATCS Monographs, Springer (2006)
[3] Ehrig, H., Ermel, C., Hermann, F.: On the
Relationship of Model Transformations Based on
Triple and Plain Graph Grammars. In: Karsai, G.,
Taentzer, G. (eds.) Proc. GraMoT’08. ACM (2008)
[4] Ehrig, H., Ermel, C., Hermann, F., Prange, U.:
On-the-Fly Construction, Correctness and
Completeness of Model Transformations based on
Triple Graph Grammars. In: Schürr, A., Selic, B.
(eds.) Proc. ACM/IEEE MODELS’09. LNCS, vol.
5795, pp. 241–255. Springer (2009)
[5] Ehrig, H., Prange, U.: Formal Analysis of Model
Transformations Based on Triple Graph Rules with
Kernels. In: Ehrig, H., Heckel, R., Rozenberg, G.,
Taentzer, G. (eds.) Proc. ICGT’08. LNCS, vol. 5214,
pp. 178–193. Springer (2008)
[6] Ehrig, H., Ehrig, K., Ermel, C., Hermann, F.,
Taentzer, G.: Information preserving bidirectional
model transformations. In: Dwyer, M.B., Lopes, A.
(eds.) Proc. FASE’07. LNCS, vol. 4422, pp. 72–86.
Springer (2007)
[7] Ehrig, H., Hermann, F., Sartorius, C.: Completeness
and Correctness of Model Transformations based on
Triple Graph Grammars with Negative Application
Conditions. In: Heckel, R., Boronat, A. (eds.) Proc.
GT-VMT’09. EC-EASST, vol. 18. EASST (2009)
[8] Giese, H., Wagner, R.: From model transformation to
incremental bidirectional model synchronization.
Software and Systems Modeling 8(1), 21–43 (2009)
[9] Golas, U., Ehrig, H., Hermann, F.: Enhancing the
Expressiveness of Formal Specifications for Model
Transformations by Triple Graph Grammars with
Application Conditions. In: Proc. Int. Workshop on
Graph Computation Models (GCM’10) (2010)
[10] Guerra, E., de Lara, J.: Attributed typed triple graph transformation with inheritance in the double pushout approach. Tech. Rep. UC3M-TR-CS-2006-00, Universidad Carlos III, Madrid, Spain (2006)
[11] Guerra, E., de Lara, J.: Model view management with triple graph grammars. In: Corradini, A., Ehrig, H., Montanari, U., Ribeiro, L., Rozenberg, G. (eds.) Proc. ICGT'06. LNCS, vol. 4178, pp. 351–366. Springer (2006)
[12] Habel, A., Pennemann, K.H.: Correctness of high-level transformation systems relative to nested conditions. Mathematical Structures in Computer Science 19, 1–52 (2009)
[13] Hermann, F., Ehrig, H., Golas, U., Orejas, F.: Formal Analysis of Functional Behaviour for Model Transformations Based on Triple Graph Grammars - Extended Version. Tech. Rep. 2010-8, TU Berlin, Fak. IV (2010)
[14] Hermann, F., Ehrig, H., Golas, U., Orejas, F.: Efficient Analysis and Execution of Correct and Complete Model Transformations Based on Triple Graph Grammars - Extended Version. Tech. Rep. 2010-13, TU Berlin, Fak. IV (2010)
[15] Hermann, F., Ehrig, H., Orejas, F., Golas, U.: Formal Analysis of Functional Behaviour of Model Transformations Based on Triple Graph Grammars. In: Proc. Int. Conf. on Graph Transformation (ICGT'10). LNCS, vol. 6372, pp. 155–170. Springer (2010)
[16] Kindler, E., Wagner, R.: Triple graph grammars: Concepts, extensions, implementations, and application scenarios. Tech. Rep. TR-ri-07-284, Department of Computer Science, University of Paderborn, Germany (2007)
[17] Königs, A., Schürr, A.: Tool Integration with Triple Graph Grammars - A Survey. In: Proc. SegraVis School on Foundations of Visual Modelling Techniques. ENTCS, vol. 148, pp. 113–150. Elsevier Science (2006)
[18] Lambers, L.: Certifying Rule-Based Models using Graph Transformation. Ph.D. thesis, Technische Universität Berlin (November 2009)
[19] Schürr, A.: Specification of Graph Translators with Triple Graph Grammars. In: Tinhofer, G. (ed.) Proc. WG'94. LNCS, vol. 903, pp. 151–163. Springer (1994)
[20] Schürr, A., Klar, F.: 15 years of triple graph grammars. In: Ehrig, H., Heckel, R., Rozenberg, G., Taentzer, G. (eds.) Proc. ICGT'08. LNCS, pp. 411–425. Springer (2008)
[21] Taentzer, G., Biermann, E., Bisztray, D., Bohnet, B., Boneva, I., Boronat, A., Geiger, L., Geiß, R., Horvath, A., Kniemeyer, O., Mens, T., Ness, B., Plump, D., Vajk, T.: Generation of Sierpinski Triangles: A Case Study for Graph Transformation Tools. In: Schürr, A., Nagl, M., Zündorf, A. (eds.) Proc. AGTIVE'07. LNCS, vol. 5088, pp. 514–539. Springer (2008)
[22] Taentzer, G., Ehrig, K., Guerra, E., de Lara, J., Lengyel, L., Levendovsky, T., Prange, U., Varro, D., Varro-Gyapay, S.: Model Transformation by Graph Transformation: A Comparative Study. In: Proc. MoDELS 2005 Workshop MTiP'05 (2005)
[23] TFS-Group, TU Berlin: AGG (2009), http://tfs.cs.tu-berlin.de/agg
Towards an Expressivity Benchmark for Mappings based
on a Systematic Classification of Heterogeneities∗
M. Wimmer (TU Vienna, wimmer@big.tuwien.ac.at), G. Kappel (TU Vienna, gerti@big.tuwien.ac.at), A. Kusel (JKU Linz, kusel@bioinf.jku.at), W. Retschitzegger (JKU Linz, werner@bioinf.jku.at), J. Schoenboeck (TU Vienna, schoenboeck@bioinf.jku.at), W. Schwinger (JKU Linz, wieland@jku.at)

* This work has been funded by the Austrian Science Fund (FWF) under grant P21374-N13.

ABSTRACT
A crucial prerequisite for the success of Model Driven Engineering (MDE) is the seamless exchange of models between different modeling tools, demanding mappings between tool-specific metamodels. Thereby the resolution of heterogeneities between these tool-specific metamodels is a ubiquitous problem representing the key challenge. Nevertheless, there is no comprehensive classification of potential heterogeneities available in the domain of MDE. This hinders the specification of a comprehensive benchmark explicating requirements wrt. the expressivity of mapping tools, which provide reusable components for resolving these heterogeneities. Therefore, we propose a feature-based classification of heterogeneities, which accordingly adapts and extends existing classifications. This feature-based classification builds the basis for a mapping benchmark, thereby providing a comprehensive set of requirements concerning the expressivity of dedicated mapping tools. In this paper a first set of benchmark examples is presented by means of metamodels and conforming models, acting as an evaluation suite for mapping tools.

Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability

General Terms
Measurement

Keywords
Classification of Heterogeneities, Mapping Benchmark

1. INTRODUCTION
With the rise of MDE, models become the main artifacts of the software development process [3]. Hence, a multitude of modeling tools is available supporting different tasks, such as model creation, model simulation, model checking, model transformation, and code generation. Seamless exchange of models among different modeling tools increasingly becomes a crucial prerequisite for effective MDE. Due to the lack of interoperability, however, it is often difficult to use tools in combination, and thus the potential of MDE cannot be fully exploited. For achieving interoperability in terms of transparent model exchange, current best practices comprise creating model transformations between different tool metamodels (MMs), with the main drawback of having to deal with all the intricacies of a certain transformation language. In contrast to that, first mapping tools [6, 18] have been proposed, allowing to specify a transformation on a more abstract level by means of reusable components. Out of the resulting mapping definitions, corresponding executable transformation code can be generated. In the definition of a mapping between MMs the resolution of heterogeneities represents the key challenge. Thereby heterogeneities result from the fact that semantically similar metamodeling concepts (M2) can be defined with different meta-metamodeling concepts (M3), leading to differently structured metamodels. As a simple example, Fig. 1 shows two metamodels of fictitious(1) domain-specific tools administrating publications. Whereas the MM of Tool1 models the type of a publication by the attribute Publication.kind (e.g., conference, workshop or journal), the MM of Tool2 represents the same semantics using the class Publication, which refers to a class Kind to determine the kind of the publication.

[Figure 1: Two Heterogeneous Tool Metamodels. MM of Tool1: class Publication with attributes name:String and kind:Integer. MM of Tool2: class Publication with attribute name:String and a reference kind (1..1) to a class Kind with attribute name:String.]

(1) Due to reasons of comprehensibility, examples comprising ontological concepts have been preferred over examples comprising linguistic concepts.
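To make the example concrete, the following minimal Python sketch (our own illustration; actual mapping tools generate executable transformation code from mapping definitions) resolves the heterogeneity of Fig. 1 by turning the integer-coded kind attribute of Tool1 into a referenced Kind object of Tool2. The integer coding in KIND_NAMES is an assumption for illustration only:

    # Illustrative resolution of the Fig. 1 heterogeneity.
    KIND_NAMES = {0: "conference", 1: "workshop", 2: "journal"}  # assumed coding

    def tool1_to_tool2(publication):
        # publication: {'name': str, 'kind': int}, as in the MM of Tool1
        kind_obj = {"name": KIND_NAMES[publication["kind"]]}    # attribute -> class
        return {"name": publication["name"], "kind": kind_obj}  # reference to Kind

    print(tool1_to_tool2({"name": "Some Paper", "kind": 1}))
    # {'name': 'Some Paper', 'kind': {'name': 'workshop'}}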
In order to resolve such heterogeneities, mapping tools provide certain reusable components. Nevertheless, it is still unclear which kinds of reusable components are required to provide the necessary expressivity. Therefore this paper provides a systematic classification of heterogeneities occurring in the domain of MDE between object-oriented MMs, thereby adapting and extending existing classifications [2, 4, 10, 11, 12, 13, 15, 17]. Moreover, this classification is used to derive an evaluation suite building an expressivity benchmark for mapping tools. A first set of examples is presented in this paper; additional heterogeneity examples can be downloaded from our homepage(2), complementing the expressivity benchmark.

The remainder of this paper is structured as follows. In Section 2 we present the design rationale behind our classification as well as the feature-based classification itself. In Sections 3-5 we exemplarily discuss heterogeneities, presenting six examples of our expressivity benchmark. Related work is discussed in Section 6 and, finally, Section 7 concludes the paper together with an outlook on future work.
2. TOWARDS A SYSTEMATIC CLASSIFICATION OF HETEROGENEITIES
This section presents the design rationale behind the proposed classification of heterogeneities as well as the classification itself. Since the classification targets the domain of MDE, it is based on object-oriented MMs, in contrast to existing classifications from the domain of data engineering, which are based either on the relational or the XML data model. To make the interconnections between heterogeneities explicit, we build our classification on a feature model [5].
2.1 Deriving Heterogeneities from Ecore
Heterogeneities result from the fact that semantically similar concepts can be defined with different metamodeling
concepts (e.g., Ecore3 ) leading to differently structured tool
metamodels. To exemplify this, Fig. 2 depicts the MMs of
Fig. 1 as Ecore instances. Thereby, several heterogeneities
arise, e.g., the MM of Tool1 represents the publication kind
by an EAttribute whereas the MM of Tool2 utilizes an
EReference, an EClass and an EAttribute to represent the
semantically equivalent information.
Figure 2: Tool Metamodels as Instances of Ecore
To gain a systematic classification of different kinds of syntactic heterogeneities, we investigated potential variation points between two Ecore-based metamodels (cf. Fig. 3). Ecore has been used since it is the prevalent meta-metamodel in MDE and since it comprises the core concepts of semantic data models [9], namely classes, attributes, references and inheritance. Therefore, the proposed classification can also be applied to other data models comprising these common core concepts, e.g., OWL4.
3 http://www.eclipse.org/modeling/emf/
4 http://www.w3.org/TR/owl-features/
In this respect, Fig. 3 depicts the relevant extract of the Ecore meta-metamodel for mappings. When comparing two Ecore-based metamodels, different cases can be distinguished, namely (i) the same Ecore concept is used in the left-hand side (LHS) MM and in the right-hand side (RHS) MM. Thereby, differences wrt. the owned attribute settings can arise, e.g., if two EClasses are used, one can be set abstract whereas the other is not – leading to a concreteness difference. Moreover, (ii) different Ecore concepts may be used in the LHS MM and in the RHS MM, e.g., an EAttribute in the LHS MM and an EReference, an EClass and an EAttribute in the RHS MM (cf. example in Fig. 2). Finally, (iii) both cases mentioned become more complex if the number of Ecore concepts for modeling a certain MM concept differs. A simple example in this respect is that in one MM two EAttributes firstName and lastName are used whereas in the other MM this information is contained in just one EAttribute name.
Figure 3: Variation Points in Ecore-based MMs
Besides syntactic heterogeneities, comprising all heterogeneities that can be derived from the syntactic definition in Ecore, semantic heterogeneities may also arise [15]. They occur when the valid instance set differs – either (i) in the number of valid instances or (ii) in the interpretation of the instance values. An example for the first case is that one MM comprises an EClass Publication whereas the other MM comprises an EClass JournalPublication, allowing only for journal instances – thus being a subset of the valid instances of the EClass Publication. An example for the second case is that one MM comprises an EAttribute amount encoding pricing information in Dollar, whereas the other MM also exhibits an EAttribute amount but encoding the pricing information in Euro. Thus, semantic heterogeneities cannot be derived from the syntax (since in both cases the MMs can be represented in syntactically identical ways) but only by incorporating interpretation, i.e., an assignment of a meaning to each piece of data [8].
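Both cases of semantic heterogeneity can be illustrated with a small sketch in plain Python (our own illustration – the paper prescribes no code; the predicate, the function name and the exchange rate are assumed):

    # Case (i): the valid instance set of JournalPublication is a subset of
    # that of Publication - a condition decides which instances carry over.
    def is_valid_journal_publication(pub):
        return pub.get("type") == "Journal"

    # Case (ii): syntactically identical 'amount' attributes, interpreted in
    # Dollar on one side and in Euro on the other - only a value translation
    # function makes the instances comparable. The exchange rate is an
    # assumed, purely illustrative value.
    USD_PER_EUR = 1.26

    def usd_to_eur(amount_usd):
        return round(amount_usd / USD_PER_EUR, 2)

    print(is_valid_journal_publication({"type": "Journal"}))  # True
    print(usd_to_eur(5000.0))                                 # 3968.25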
2.2 Classification of Heterogeneities
Based on this design rationale, we introduce a classification of heterogeneities. It is expressed using the feature model formalism [5], which makes it possible to clearly point out the interconnections between the different kinds of heterogeneities (e.g., xor features modeling mutually exclusive features versus or features allowing several features to be picked at once). Thereby heterogeneities are divided into the two main classes of (i) semantic heterogeneities, i.e., heterogeneities wrt. what
is represented by a MM, and (ii) syntactic heterogeneities, i.e., heterogeneities wrt. how it is represented (cf. Fig. 4), whereby these two classes might occur jointly, as modeled by the or relationship in between.
Figure 4: Heterogeneity Feature Model
Semantic Heterogeneities. Concerning semantic heterogeneities – as mentioned above – two main cases can be distinguished, namely (i) differences in the number of valid instances and (ii) differences in the interpretation of the instance values. With respect to the first case, all the set-theoretic relationships might occur, as modeled by the corresponding sub-features. Regarding the second case, diverse modifications of the values might be necessary to translate the values of one MM to correct values of the other MM such that they conform to the interpretation of the other MM.
Syntactic Heterogeneities. With respect to syntactic heterogeneities we distinguish between simple naming differences (i.e., a difference in the value of the name attribute of ENamedElement – cf. Fig. 3) and more challenging structural differences. Although names play an important role when deriving the semantics of a certain concept, names do not allow one to automatically conclude on the semantics. Thereby, the two cases of (i) same semantics but different naming, i.e., synonyms, and (ii) different semantics but same naming, i.e., homonyms, can be distinguished.
With respect to structural differences, again two main cases can be distinguished – namely core concept differences and inheritance differences. Thereby, core concept differences are differences that occur due to the different usage of classes, attributes and references between two MMs. In addition, these two main categories can be further divided into same metamodeling concept heterogeneities and different metamodeling concept heterogeneities, differentiating whether the same Ecore concepts have been used in the LHS MM and in the RHS MM or not. In the context of core concept differences, additionally a different number of concepts may have been used in the two MMs, leading to different source-target-concept cardinalities. In the following sections a first set of benchmark examples is given, divided into three main packages, comprising (i) core concept heterogeneities with same metamodeling concept heterogeneities, (ii) core concept heterogeneities with different metamodeling concept heterogeneities and (iii) inheritance heterogeneities. Due to space limitations only a subset of all potential heterogeneities is explained in detail by means of concrete metamodels and according model instances, but nevertheless examples from each main category are given. In this respect, the benchmark examples are described uniformly, comprising (i) a short description, (ii) the main challenges, (iii) the example description, and (iv) a discussion of resolution strategies. Complementary benchmark examples are presented on our collaborative homepage, which invites the community to participate in adding and discussing benchmark examples.
3. CORE CONCEPT HETEROGENEITIES – SAME CONCEPTS
Same metamodeling concept heterogeneities are heterogeneities that occur although the same metamodeling concept has been used in the LHS MM as well as in the RHS MM, as mentioned above. In this respect, two main differences might emerge – either the concepts exhibit different attribute settings (cf. Fig. 3) or a different number of concepts has been used in the MMs to express the same semantic concept (cf. Source-Target-Concept Cardinality in Fig. 4). In the following, two examples of this category are given.
3.1 Benchmark Example 1
This first example (cf. Fig. 5) only exhibits differences wrt. attribute settings (cf. optional features of A(ttribute)2A(ttribute) and R(eference)2R(eference) in Figure 4) as well as semantic heterogeneities. The main challenges in this example can be summarized as follows:
1. EAttribute Professor.dateOfBirth –
EAttribute Prof.bornIn:
A2A, Multiplicity Difference, Datatype Difference
2. EAttribute Professor.salary – EAttribute Prof.salary: Semantic Heterogeneity (Interpretation of Instance Values Difference), A2A
3. EReference Professor.publications – EReference Prof.journals: R2R, Multiplicity Difference
4. EClass Publication – EClass Journal: Semantic Heterogeneity (Number of Instances Difference), C2C
Figure 5: Benchmark Example 1 – Same Metamodeling Concept Heterogeneities
Example Description. This first benchmark example (cf. Fig. 5) exhibits four main challenges. With respect to the first challenge, a multiplicity difference as well as a datatype difference between the EAttributes Professor.dateOfBirth and Prof.bornIn arises. Concerning the second challenge, a semantic heterogeneity between the EAttributes Professor.salary and Prof.salary emerges, since Professor.salary is encoded in Dollars whereas Prof.salary is encoded in Euros, i.e., a difference in the interpretation of the values. Regarding the third challenge, a multiplicity difference between the EReferences Professor.publications and Prof.journals exists. Finally, the fourth challenge again incorporates a semantic heterogeneity – but this time a difference in the number of valid instances. For resolving the differences of the first three challenges, corresponding functions are required which are able either to generate values or to transform values. In contrast to that, for resolving the heterogeneity of the fourth challenge a corresponding condition is needed that filters those instances that are still valid in the context of the RHS EClass.
Discussion of Resolution Strategies. When taking a look at the example instances, one can see that a resolution strategy has been chosen that minimizes information loss and achieves valid instances only. This can be seen since instance P2 has been kept in the RHS although it does not reference any journal publication in the LHS model. Another potential resolution strategy would be to keep only those Professor instances that actually exhibit a journal publication. If this were the case, a semantic heterogeneity between the EClasses Professor and Prof would also exist, since the valid instance sets would potentially differ. Another interesting point in this example is that the RHS MM is more restrictive than the LHS MM, since the EAttribute Prof.bornIn always requires a value and since each instance of Prof requires at least one link to a journal publication. Since these restrictions do not exist in the LHS MM, instances of the LHS MM may not fulfill them. Therefore some resolution strategy is needed – either auto-generating values or incorporating user interaction – in order to produce valid instances of the RHS MM.
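These resolution strategies can be made concrete with the following plain-Python sketch (again our own illustration; the placeholder year, the exchange rate and the 'TODO' default are assumptions, not part of the benchmark):

    USD_PER_EUR = 1.26  # assumed conversion rate for the salary challenge

    def transform_professor(prof):
        # Challenge 1 (A2A, multiplicity/datatype difference): derive the
        # mandatory Integer 'bornIn' from the optional Date 'dateOfBirth';
        # without a date, a value must be auto-generated or obtained via
        # user interaction (here: a placeholder year).
        date = prof.get("dateOfBirth")
        born_in = int(date.split(".")[-1]) if date else 1900
        # Challenge 2 (A2A, semantic heterogeneity): reinterpret Dollar
        # values as Euro values.
        salary_eur = round(prof["salary"] / USD_PER_EUR)
        # Challenges 3 and 4 (R2R multiplicity difference, C2C subset):
        # a condition filters the referenced publications down to journals;
        # the 'type' attribute has no RHS counterpart (information loss).
        journals = [{"name": p["name"]}
                    for p in prof["publications"] if p["type"] == "Journal"]
        if not journals:  # 'journals' is 1..* in the RHS MM
            journals = [{"name": "TODO"}]  # auto-generated or user-supplied
        return {"name": prof["name"], "bornIn": born_in,
                "salary": salary_eur, "journals": journals}

    p2 = {"name": "Prof2", "dateOfBirth": "", "salary": 3000,
          "publications": [{"name": "Paper1", "type": "Conference"}]}
    print(transform_professor(p2))
    # {'name': 'Prof2', 'bornIn': 1900, 'salary': 2381,
    #  'journals': [{'name': 'TODO'}]}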
3.2 Benchmark Example 2
In contrast to the first example, which restricts itself to source-target-concept cardinalities of 1:1, this example (cf. Fig. 6) additionally contains differences wrt. the number of concepts (cf. Source-Target-Concept Cardinality in Fig. 4). The main challenges in this example can be summarized as follows:
1. EAttribute Publication.title, EAttribute Publication.subtitle – EAttribute Publication.name: Source-Target-Concept Cardinality: n:1, A2A
2. EClass Publication, EClass Kind – EClass Publication: Source-Target-Concept Cardinality: n:1, C2C
3. EAttribute Kind.name – EAttribute Publication.kind: A2A, Context Difference
Figure 6: Benchmark Example 2 – Same Metamodeling Concept Heterogeneities
Example Description. This benchmark example (cf. Fig. 6) poses three challenges. Concerning the first challenge, there is an n:1 source-target-concept cardinality between the EAttributes title, subtitle and name. In order to resolve this heterogeneity, merging functionality is needed, which in this case is basically a concatenation function. Concerning the second challenge, again an n:1 source-target-concept cardinality exists, but this time between the EClasses Publication, Kind and Publication. Therefore, again merging functionality is needed, allowing objects to be merged under a certain condition. Finally, the third challenge consists of a context difference between the EAttributes Kind.name and Publication.kind. For its resolution, the assignment of values across object boundaries is needed.
Discussion of Resolution Strategies. When taking a look at the example instances in Fig. 6, one can see that for each combination of a Publication object and the referenced Kind object a Publication object should be generated. Concerning the merge of the attributes, different strategies could be followed, whereby in this case a simple concatenation has been chosen. Other strategies comprise another concatenation order. In case of other datatypes (e.g., numbers), arbitrary calculations could be incorporated.
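The required merging functionality can be sketched as follows in plain Python (illustrative only; the separator and all names are our own choices):

    def merge_publication(pub, kind):
        return {
            # Challenge 1 (n:1 A2A): 'title' and 'subtitle' are merged into
            # 'name' - here by simple concatenation with a separator;
            # another order or arbitrary calculations would also be possible.
            "name": pub["title"] + " - " + pub["subtitle"],
            # Challenge 3 (A2A, context difference): Kind.name is assigned
            # across object boundaries to Publication.kind.
            "kind": kind["name"],
        }

    # Challenge 2 (n:1 C2C): one RHS Publication object per combination of
    # a LHS Publication object and its referenced Kind object.
    k1 = {"name": "Journal"}
    p1 = {"title": "P1", "subtitle": "S1", "kind": k1}
    print(merge_publication(p1, p1["kind"]))
    # {'name': 'P1 - S1', 'kind': 'Journal'}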
4. CORE CONCEPT HETEROGENEITIES – DIFFERENT CONCEPTS
Different metamodeling concept heterogeneities result from expressing the same semantic concept with different modeling concepts in the LHS MM and in the RHS MM. In our classification, potential heterogeneities were derived by systematically combining the identified core concepts of semantic data models. To exemplify these heterogeneities, two benchmark examples are discussed in the following.
4.1 Benchmark Example 3
The third example (cf. Fig. 7) deals with the fact that
a concept is modeled in the LHS MM by means of an EAttribute whereas the RHS MM models this concept explicitly by means of an EClass. Thus, the main challenges in
this example can be summarized as follows:
1. EAttribute Publication.kind – EClass Kind: A2C
2. EClass Publication, EAttribute Publication.kind
– EReference Publication.kind: CA2R
Figure 7: Benchmark Example 3 – Different Metamodeling Concept Heterogeneities (A2C, CA2R)
Example Description. The first challenge is that the kind of the publication is represented by means of the EAttribute Publication.kind in the LHS MM, whereas the RHS MM makes the kind explicit by means of the EClass Kind, which is therefore classified as A(ttribute)2C(lass) in Fig. 7. In order to link publications with the publication kind, the RHS MM provides the EReference Publication.kind, for which there is no according counterpart in
the LHS MM, i.e., the RHS links have to be generated, representing the second challenge in the example. In order to establish such additional links in the RHS, information is needed about how the concepts to be linked were related in the LHS MM. With respect to this example, the source of the EReference Publication.kind is represented in the LHS MM by means of the EClass Publication and the target of the EReference by means of the EAttribute Publication.kind. Therefore, this heterogeneity is classified as C(lass)A(ttribute)2R(eference), whereby the first letter depicts the LHS concept used for the source of the reference to be generated and the second letter the LHS concept used for its target.
Discussion of Resolution Strategies. When taking a look at the example instances, one can see that the desired intention of an A2C heterogeneity is that an according Kind object should be generated only for distinct Publication.kind attribute values. Therefore, the RHS model exhibits only a single object named Journal (cf. K1 in Fig. 7), which is referenced by the Publication objects P1 and P2.
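The following plain-Python sketch (illustrative names and structures only) shows the deduplicating generation of Kind objects together with the CA2R link derivation:

    def lift_kinds(publications):
        kinds = {}      # one Kind object per distinct attribute value
        rhs_pubs = []
        for pub in publications:
            kind_obj = kinds.setdefault(pub["kind"], {"name": pub["kind"]})
            # CA2R: the link's source stems from the LHS EClass Publication,
            # its target from the LHS EAttribute Publication.kind.
            rhs_pubs.append({"title": pub["name"], "kind": kind_obj})
        return rhs_pubs, list(kinds.values())

    pubs = [{"name": "P1", "kind": "Journal"},
            {"name": "P2", "kind": "Journal"},
            {"name": "P3", "kind": "Conference"}]
    rhs_pubs, kind_objs = lift_kinds(pubs)
    print(len(kind_objs))  # 2 - 'Journal' is generated only once (cf. K1)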
4.2 Benchmark Example 4
Whereas the previous example exhibited the heterogeneity that a LHS concept is modeled by means of an EAttribute and the RHS concept by means of an EClass, the following example (cf. Fig. 8) exhibits the heterogeneity that a LHS concept is modeled by means of an EReference whereas the equivalent RHS concept is again represented by an EClass. The main challenges in this example are:
1. EReference Professor.publications – EClass DBLPEntry: R2C
2. EReference Professor.publications – EAttribute DBLPEntry.id: R2A
3. EClass Professor, EReference Professor.publications – EReference Professor.entries: CR2R
4. EReference Professor.publications, EClass Publication – EReference DBLPEntry.publication: RC2R
Figure 8: Benchmark Example 4 – Different Metamodeling Concept Heterogeneities (R2C, R2A)
Example Description. Whereas the class Professor in the LHS MM in Fig. 8 has a direct EReference Professor.publications, the RHS MM offers this information only indirectly by means of the EClass DBLPEntry and its EReference DBLPEntry.publication, representing the first challenge in this example (cf. R(eference)2C(lass) feature value in Fig. 4). Concerning the second challenge, values for the DBLPEntry.id EAttribute have to be generated. Since the containing RHS EClass is generated on the basis of the LHS EReference Professor.publications, the according EAttribute also has to be generated on the basis of this EReference (cf. R(eference)2A(ttribute) feature value in Fig. 4). With respect to the third and fourth challenges, the according links have to be established. For this, again the information is needed about how the concepts to be linked were related in the LHS MM, as described above. Concerning the Professor.entries EReference, the source of the EReference (Professor) is generated on the basis of the LHS EClass Professor and the target of the EReference (DBLPEntry) on the basis of the EReference Professor.publications – thus this heterogeneity is classified as C(lass)R(eference)2R(eference). A similar situation occurs for the RHS EReference DBLPEntry.publication, but in this case the source of the EReference is based on an EReference and the target is based on an EClass – a heterogeneity classified as R(eference)C(lass)2R(eference).
Discussion of Resolution Strategies. The challenge in this benchmark example is to obtain objects conforming to the RHS EClass DBLPEntry (cf. example instances in Fig. 8). These RHS objects have to be created on the basis of the LHS links, since these links encode the information which publications belong to which professor, which is also the task of the DBLPEntry objects. Therefore, Fig. 8 depicts four DBLPEntry objects which originate from the four LHS Professor.publications links. To set the DBLPEntry.id value, a function is needed which generates an according id, whereby again for every LHS link an according RHS value should be created.
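A plain-Python sketch of this strategy (the id counter and the data layout are illustrative assumptions) could look as follows:

    from itertools import count

    def reify_links(professors):
        ids = count(1)          # R2A: DBLPEntry.id values are generated
        entries = []
        for prof in professors:
            prof["entries"] = []    # CR2R: Professor --> DBLPEntry links
            for pub in prof["publications"]:
                entry = {"id": next(ids), "publication": pub}  # RC2R link
                prof["entries"].append(entry)
                entries.append(entry)
        return entries

    profs = [{"name": "Prof1",
              "publications": [{"name": "P1"}, {"name": "P2"}]},
             {"name": "Prof2",
              "publications": [{"name": "P2"}, {"name": "P3"}]}]
    print(len(reify_links(profs)))  # 4 - one DBLPEntry per LHS link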
5. INHERITANCE HETEROGENEITIES
In the previous sections we discussed potential heterogeneities when considering the metamodeling concepts of classes, attributes and references. Finally, heterogeneities might be caused by the concept of inheritance. In this respect we again distinguish between heterogeneities that might occur although both MMs use inheritance (cf. same metamodeling concept inheritance differences in Fig. 4) and heterogeneities that occur if only one MM makes use of inheritance (cf. different metamodeling concept inheritance differences in Fig. 4). Similar to the aforementioned same metamodeling concept differences (cf. Section 3), same metamodeling concept inheritance differences occur due to different attribute values or links in the Ecore MMs (cf. Fig. 3), whereas the latter heterogeneities occur if an inheritance hierarchy in one MM is expressed by other concepts (i.e., classes, attributes, and references) in the other MM. In the following, one example per category is given.
5.1 Benchmark Example 5
This example (cf. Fig. 9) belongs to the same metamodeling concept category and therefore both MMs make use of inheritance. Nevertheless, certain heterogeneities occur, comprising breadth differences, depth differences and concreteness differences. The main challenges in this example can be summarized as follows:
1. EClass FullProf, EClass AssistantProf – EClass FullProf: I2I, Breadth Difference
2. EClass Assistant – EClass Assistant: I2I, Concreteness Difference, Depth Difference
3. EClass PrePhd, EClass PostPhd – No corresponding EClass: I2I, Breadth Difference
Figure 9: Benchmark Example 5 – Same Metamodeling Concept Heterogeneities
Example Description. Concerning the first challenge, a breadth difference between the LHS EClasses FullProf, AssistantProf and the RHS EClass FullProf exists. This is because the number of sibling classes in the context of a certain parent class differs. For resolving breadth differences, the strategy of mapping instances of a class existing only in the LHS MM to a concrete parent class in the RHS MM can be applied. Nevertheless, since the parent classes of the EClass AssistantProf are abstract, instances of AssistantProf get lost. With respect to the second challenge, a concreteness difference as well as a depth difference occurs between the two EClasses Assistant. This is because the EClass Assistant in the LHS MM is set abstract whereas the corresponding EClass Assistant in the RHS MM is concrete. Additionally, a depth difference exists, since the longest path of subclasses in the context of the EClass Assistant in the LHS MM is 1 whereas it is 0 in the context of the corresponding class in the RHS MM. For resolving the
concreteness difference, no strategy is needed in this example, since the LHS class is abstract and therefore no instances can exist. The situation would be different if it were inverted: then instances might be lost if no concrete class in the RHS MM for including those instances can be found. For resolving the depth difference, the strategy can be pursued to map instances of the classes only existing in the LHS MM to some concrete parent class in the RHS MM. Therefore, in this case the instances of the EClasses PrePhd and PostPhd result in instances of the parent EClass Assistant in the RHS MM. Finally, regarding the third challenge, a breadth difference between the EClasses PrePhd and PostPhd and the non-existing RHS classes exists. Since in this case the breadth difference overlaps with the depth difference of challenge 2 (this being the case since the EClass Assistant in the RHS MM exhibits no subclasses at all), no additional resolution strategy is needed here.
Discussion of Resolution Strategies. When taking a look at the chosen resolution strategies, one can see that a strategy has been chosen that tries to minimize instance loss and thus information loss. Therefore, instances of a class that only exists in the LHS MM should be kept by mapping them to some concrete parent class, due to the is-a relationship between the classes. Nevertheless, the explicit type information and additional features only owned by the subclass are lost. Therefore, sometimes a strategy that omits these instances might also be useful.
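This re-typing strategy can be sketched as follows (plain Python; the hard-coded class tables merely mirror the example and are otherwise assumptions):

    RHS_CONCRETE = {"FullProf", "Assistant"}
    LHS_PARENTS = {"PrePhd": ["Assistant", "ResearchStaff"],
                   "PostPhd": ["Assistant", "ResearchStaff"],
                   "AssistantProf": ["Professor", "ResearchStaff"]}

    def retype(instance):
        cls = instance["class"]
        if cls in RHS_CONCRETE:
            return instance
        for parent in LHS_PARENTS.get(cls, []):
            if parent in RHS_CONCRETE:          # closest concrete parent
                return dict(instance, **{"class": parent})
        return None  # all RHS parents abstract: the instance is lost

    print(retype({"class": "PrePhd", "name": "PrePhd1"}))  # -> Assistant
    print(retype({"class": "AssistantProf", "name": "AssProf1"}))  # -> None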
5.2 Benchmark Example 6
This example (cf. Fig. 10) belongs to the different metamodeling concept category and therefore only one MM makes
use of inheritance. The main challenge in this example can
be summarized as follows:
1. EAttribute ResearchStaff.kind –
EClasses ResearchStaff, Professor, Assistant
and FullProf in inheritance hierarchy: A2I
Example Description. With respect to the main challenge in this example, an A(ttribute)2I(nheritance) heterogeneity between the EAttribute ResearchStaff.kind and the EClasses ResearchStaff, Professor, Assistant and FullProf occurs. For resolving this kind of heterogeneity, a condition is needed to divide the instances of the EClass ResearchStaff according to the values of the EAttribute kind, in order to instantiate instances of the corresponding RHS classes. Thereby the problem may arise that the EAttribute of the LHS MM comprises values that do not correspond to any (concrete) EClass in the RHS MM. This is the case in the example with the instance R1, since the corresponding EClass Professor in the RHS MM is abstract and can thus not be instantiated, causing information loss.
Discussion of Resolution Strategies. Concerning the resolution strategy chosen in this example, again information loss should be prevented whenever possible. Nevertheless, as already discussed above, this may not always be possible.
Figure 10: Benchmark Example 6 – Different Metamodeling Concept Heterogeneities (A2I)
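A plain-Python sketch of this condition-based instantiation (the class table is an illustrative encoding of the RHS MM) could look as follows:

    RHS_ABSTRACT = {"ResearchStaff": True, "Professor": True,
                    "FullProf": False, "Assistant": False}

    def instantiate(staff):
        # A2I: the value of the LHS EAttribute 'kind' selects the RHS class
        # to instantiate; values naming an abstract RHS class cause
        # information loss, as for instance R1 in Fig. 10.
        kind = staff["kind"]
        if RHS_ABSTRACT.get(kind) is False:   # a concrete RHS class exists
            return {"class": kind, "name": staff["name"]}
        return None                           # abstract or unknown: lost

    print(instantiate({"name": "staff2", "kind": "FullProf"}))   # kept
    print(instantiate({"name": "staff1", "kind": "Professor"}))  # None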
6. RELATED WORK
In the following, two threads of related work are considered. First, our feature-based classification is compared to existing classifications. Second, our mapping benchmark is related to existing mapping benchmarks. In this respect, at first the most closely related area of model engineering is examined. Moreover, the more widely related areas of data engineering and ontology engineering are investigated.
Existing Classifications.
Model Engineering. Although model transformations and
thus the resolution of heterogeneities between MMs play a
vital role in MDE, to the best of our knowledge no dedicated
survey examining potential heterogeneities exists.
Data Engineering. In contrast, in the area of data engineering a plethora of literature has been highlighting different aspects of heterogeneities in the context of database schemata for decades. A first classification of semantic
and structural heterogeneities when integrating two different
schemas was presented by Batini et al. in [2]. A systematic
classification of possible variations in a SQL statement was
presented by Kim et al. in [11], detailing Table-Table and
Attribute-Attribute heterogeneities, e.g., wrt. cardinalities.
The classification of Kashyap et al. presented in [10] provides a broad overview on possible heterogeneities in a data
integration scenario comprising semantic heterogeneities and
conflicts occurring between same modeling concepts. The
work of Blaha et al. presented in [4] describes patterns
resolving syntactic heterogeneities, comprising same metamodeling concept heterogeneities as well as different metamodeling concept heterogeneities. Finally, the classification of Legler [13] presents a systematic approach for
attribute mappings by combining possible attribute correspondences with cardinalities.
Ontology Engineering. Concerning the domain of ontology engineering, pattern collections as well as classifications exist. A pattern collection has been presented by Scharffe et al. in [14]. Thereby correspondence patterns for ontology alignments are presented, but on a rather coarse-grained level, e.g., conditional patterns dealing with attribute differences and transformation patterns, vaguely dealing with different metamodeling concept heterogeneities. With respect to existing classifications, Visser et al. [17] and Klein [12] provide a comprehensive list of semantic heterogeneities. Nevertheless, they have a strong focus on semantic heterogeneities, neglecting syntactic heterogeneities.
Summarizing, although there are several classifications available, none explicitly focuses on the domain of MDE. Therefore we systematically analyzed variation points in the Ecore meta-metamodel in order to extend and adapt existing classifications. In this respect, we aligned on the one hand terms of existing classifications, e.g., most classifications introduced terms for the heterogeneities summarized in our classification as same metamodeling concept heterogeneities. On the other hand, we introduced new heterogeneities stemming from the explicit concepts of references and inheritance in object-oriented metamodels, in contrast to existing classifications based either on the relational or the XML data model. Finally, current classifications fail to make explicit how different types of heterogeneities relate to each other, which we formalized by means of a feature model.
Existing Benchmarks.
Model Engineering. To the best of our knowledge no benchmark for mapping systems in the area of MDE exists. Nevertheless, a benchmark for evaluating the performance of graph transformations [16] has been proposed.
Data Engineering. In the area of data engineering, Alexe et al. propose in [1] a first benchmark for mapping systems, thereby presenting a basic suite of mapping scenarios which should be readily supported by any mapping system focussing on information integration. In this respect, ten examples are discussed for which the actual transformation functions are given in terms of XQuery5 expressions. Additional examples are presented on their homepage6. Although
the benchmark provides a first set of mapping scenarios, it remains unclear how the scenarios have been obtained and whether they provide full coverage in terms of expressivity. Although XQuery expressions are given to define the semantics, some of the XQuery functions assume the availability of custom functions which are not provided. Since there are also no RHS models given, it is hard to determine the actual outcome of the transformation. Finally, some scenarios are not clearly specified by the given query (cf. scenarios 2 and 17 on their homepage). A further benchmark called THALIA is presented by Hammer et al. in [7]. It provides researchers with a collection of twelve benchmark queries given in XQuery, focusing on the resolution of syntactic and semantic heterogeneities in a data integration scenario. For every query a so-called reference schema (i.e., global schema) and a challenge schema (i.e., the schema to be integrated) are provided, together with instances. Although the paper claims a systematic classification of semantic and syntactic heterogeneities leading to the presented queries, it is merely an enumeration of heterogeneities where the rationale behind them is left unclear.
Ontology Engineering. With respect to the area of ontology engineering, no dedicated mapping benchmark exists. Nevertheless, effort has been spent on the evaluation of matching tools, i.e., tools for automatically discovering alignments between ontologies, resulting in an ontology matching benchmark7, whereby these examples could be of interest for a dedicated mapping benchmark as well.
Summarizing, although both benchmarks from the area of data engineering provide useful scenarios in the context of XML, they do not provide a systematic classification resulting in a systematic set of benchmark examples to evaluate the expressivity of a certain mapping system.
5 http://www.w3.org/TR/xquery/
6 http://www.stbenchmark.org/
7 http://oaei.ontologymatching.org/2010/
7. CONCLUSION AND FUTURE WORK
In this paper we presented a systematic classification of heterogeneities occurring between Ecore-based MMs. Nevertheless, this classification of heterogeneities can also be applied to other semantic data models comprising the common core concepts this classification is based on. Moreover, a first set of benchmark examples has been proposed stating the requirements a mapping tool should fulfill. Additionally, these benchmark examples can be used to compare solutions realized with ordinary transformation languages. Further work comprises the completion of the benchmark examples to fully cover the classification. However, the success of a benchmark heavily depends on the agreement of the community – thus our collaborative homepage invites discussion. Finally, a tool evaluation on the basis of this benchmark is envisioned, comparing and evaluating mapping tools from diverse engineering domains wrt. their expressivity.
8. REFERENCES
[1] B. Alexe, W.-C. Tan, and Y. Velegrakis. STBenchmark: Towards a Benchmark for Mapping Systems. VLDB Endow., 1(1):230–244, 2008.
[2] C. Batini, M. Lenzerini, and S. B. Navathe. A Comparative Analysis of Methodologies for Database Schema Integration. ACM Comp. Surv., 18(4):323–364, 1986.
[3] J. Bézivin. On the Unification Power of Models.
Journal on SoSyM, 4(2):31, 2005.
[4] M. Blaha and W. Premerlani. A catalog of object
model transformations. In Proc. of the 3rd Working
Conference on Reverse Engineering, WCRE’96, pages
87–96, 1996.
[5] K. Czarnecki, S. Helsen, and U. Eisenecker. Staged
Configuration Using Feature Models. In Proc. of Third
Software Product Line Conf., pages 266–283, 2004.
[6] M. Del Fabro and P. Valduriez. Towards the efficient
development of model transformations using model
weaving and matching transformations. Journal on
SoSyM, 8(3):305–324, July 2009.
[7] J. Hammer, M. Stonebraker, and O. Topsakal.
THALIA: Test harness for the assessment of legacy
information integration approaches. In Proc. of the
Int. Conf. on Data Engineering, ICDE, pages
485–486, 2005.
[8] D. Harel and B. Rumpe. Meaningful modeling:
What’s the semantics of ”semantics”? Computer,
37:64–72, 2004.
[9] R. Hull and R. King. Semantic Database Modeling:
Survey, Applications, and Research Issues. ACM
Comput. Surv., 19(3):201–260, 1987.
[10] V. Kashyap and A. Sheth. Semantic and schematic
similarities between database objects: A context-based
approach. The VLDB Journal, 5(4):276–304, 1996.
[11] W. Kim and J. Seo. Classifying Schematic and Data
Heterogeneity in Multidatabase Systems. Computer,
24(12):12–18, 1991.
[12] M. Klein. Combining and relating ontologies: an
analysis of problems and solutions. In Proc. of
Workshop on Ontologies and Information Sharing,
IJCAI'01, 2001.
[13] F. Legler and F. Naumann. A Classification of Schema
Mappings and Analysis of Mapping Tools. In Proc. of
the GI-Fachtagung für Datenbanksysteme in Business,
Technologie und Web (BTW’07), 2007.
[14] F. Scharffe and D. Fensel. Correspondence Patterns
for Ontology Alignment. In Proc. of the 16th Int.
Conf. on Knowledge Engineering, EKAW ’08, pages
83–92, 2008.
[15] A. P. Sheth and J. A. Larson. Federated Database
Systems for Managing Distributed, Heterogeneous,
and Autonomous Databases. ACM Comput. Surv.,
22(3):183–236, 1990.
[16] G. Varro, A. Schürr, and D. Varro. Benchmarking for
graph transformation. In Proc. of the 2005 IEEE
Symposium on Visual Languages and Human-Centric
Computing, VLHCC ’05, pages 79–88, 2005.
[17] P. R. S. Visser, D. M. Jones, T. J. M. Bench-Capon,
and M. J. R. Shave. An analysis of ontological
mismatches: Heterogeneity versus interoperability. In
Proc. of AAAI 1997 Spring Symposium on Ontological
Engineering, 1997.
[18] M. Wimmer, G. Kappel, A. Kusel, W. Retschitzegger,
J. Schönböck, and W. Schwinger. Surviving the
Heterogeneity Jungle with Composite Mapping
Operators. In Proc. of the 3rd Int. Conf. on Model
Transformation, ICMT 2010, pages 260–275, 2010.
Specifying Overlaps of Heterogeneous Models for Global
Consistency Checking
Zinovy Diskin, Yingfei Xiong, and Krzysztof Czarnecki
University of Waterloo
Waterloo, ON, Canada
{zdiskin, yingfei, kczarnec}@gsdlab.uwaterloo.ca
ABSTRACT
Software development often involves a set of models defined
in different metamodels, each model capturing a specific
view of the system. We call this set a multimodel, and its elements partial or local models. Since partial models overlap,
they may be consistent or inconsistent wrt. a set of global
constraints.
We present a framework for specifying overlaps between
partial models and defining their global consistency. An advantage of the framework is that heterogeneous consistency
checking is reduced to the homogeneous case yet merging
partial metamodels into one global metamodel is not needed.
We illustrate the framework with examples and sketch a formal semantics for it based on category theory.
Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability
General Terms
Design, Languages, Theory, Verification.
1. INTRODUCTION
Software development often involves a set of heterogeneous models, such as use cases, process models, UML design models, and code. These models are defined by different metamodels, and are often built by different teams, but collectively represent a single system. Due to possible overlaps between models, individually consistent models may be globally inconsistent if taken together. Many existing approaches focus on checking consistency of a single model [25] or a pair of models [9]. However, individual consistency or pairwise consistency does not guarantee global consistency. For example, Fig. 1 shows three UML class diagrams D1,2,3, where the classes connected by a dashed line are considered to be the same class (though named differently). Each of the three diagrams is consistent, and each pair of them is consistent, but taken together the three diagrams are inconsistent: there is a cycle in the inheritance chain.
Figure 1: Three globally inconsistent models
The example shows two issues in checking global consistency. First, we need to specify the models' overlap. For models like code and UML class diagrams extracted from code, we may know their overlap by matching the elements by name. But for models in the conceptual stage, we cannot deduce their overlap automatically. For example, an entity “Person” created by a business analyst and a table “Employee” existing in a legacy database may refer to the same concept even though they have different names. Second, when we have an overlap specification, we need an approach to check global consistency.
Sabetzadeh et al. [22] proposed to check global consistency of homogeneous models by merging them. First, the models' overlap is specified by a correspondence diagram: a set of auxiliary models and mappings “in-between” the local models, which declare some elements in different local models as being actually the same. Then all local models are merged into one model modulo the correspondence, i.e., elements of local models declared the same in the correspondence diagram become one element. Finally, consistency of the merged model is checked. Thus, verifying global consistency amounts to checking consistency of a single model. However, the approach was developed for the case of homogeneous models only.
The goal of the paper is to adopt the consistency-checking-by-merging (CCM) idea for the heterogeneous situation. A straightforward solution is to first merge all involved metamodels so that all local models become instances of the same global metamodel; then we can merge them and check the result wrt. the constraints in the global metamodel. Though theoretically possible, in practice this approach leads to dealing with huge models and metamodels resulting from the merge, which is cumbersome and not effective. We present another approach in which merging metamodels is reduced to an unavoidable minimum, and merging models is reduced to merging only their relevant parts. Briefly, we find common views between metamodels, project related models to spaces of instances (overlaps) determined by those views, and then apply the CCM approach to the homogeneous set of projections.
We formulate the framework in a general way based on
category theory. This makes it applicable to a wide class of
models and metamodels, whose carrier structures are graphs,
attributed graphs, or general graph-like structures. By the
latter we mean systems of sets (nodes, arrows, arrows between arrows...) interrelated by (source and target) functions.
Realization of the approach requires several challenging issues to be solved: type-safe model matching, specification of
indirect overlap between metamodels, and inter-metamodel
constraints. We will discuss these issues in more detail
in Section 3, after we briefly outline the basics of the CCM approach in Section 2.
The rest of the paper is structured as follows. Section 4
describes our main techniques with simple examples. Section 5 presents general definitions and constructions in a
semi-formal way. Relation to other works is discussed in
Section 6. Section 7 concludes.
2. BACKGROUND: HOMOGENEOUS OVERLAP AND CONSISTENCY
We briefly review the basics of the CCM-approach, and
also show how to manage conflicts between values.
2.1 Software models are typed graphs
We consider metamodels as pairs M = (GM , CM ) with
GM a graph and CM a set of constraints. A model (M ’s
instance) is a graph typed over M , i.e., a pair A = (GA , tA )
with GA a graph (typically much bigger than GM ) and
tA : GA → GM a graph mapping (which preserves the incidence relationship between arrows and nodes) such that
all constraints in set CM are satisfied.
For example, Fig. 2 shows how to represent a UML class
diagram A as a typed graph. GM is the graph representing the metamodel of UML class diagrams; GA is the graph
representing the diagram A; and tA is the type mapping.
UML classes, attributes, primitive values and generalization
relations are represented as nodes; their relationships are
captured by arrows. The value of mapping tA at an element e is given after colon, e.g., expression “10:Class” means
tA (10)=Class for node 10. Identifiers of some elements are
omitted, e.g., for all arrows. To refer to the elements, we
will use the following notation: if N is the name of an element e, let &N be the slot (owned by e) where the name
is held, and &&N be e itself. For example, &‘Order’=11 and &&‘Order’=10. In its turn, graph GM is typed over the metametamodel graph GMM.
Any UML class diagram can be represented by a typed graph as above, but not the converse. To ensure that a typed graph is a correct diagram, constraints must be declared and added to the metamodel. For example, (C1) a class has only one name, or (C2) a class has only one parent class (we assume that multiple inheritance is prohibited), or (C3) classes with stereotype ’singleton’ cannot be instantiated with more than one object. Note that constraints can either be imposed by a particular metamodeling technique, e.g., constraints (C1) and (C2), or can be user-defined, e.g., (C3), in a suitable language like OCL. In this paper we do not distinguish these two types and consider them abstractly as constraints over graphs.
Figure 2: Graph Representation
2.2 Matching models via spans
Suppose two business analysts independently build two UML diagrams, A1 and A2 in Figure 3. To check their global consistency, we first need to specify the overlap between the diagrams. Suppose we know that class ’OnlineOrder’ in diagram A1 and class ’Order’ in A2 refer to the same class, and their ’price’ attributes refer to the same attribute. We could write the following two informal equations:
OnlineOrder@A1 = Order@A2
price@A1 = price@A2
Note that these equations conform to the type system of class diagrams: we match a class to a class and an attribute to an attribute. Hence, we can represent the set of equations by a class diagram A0 shown in the middle of Fig. 3. The question mark indicates that the name of the class is unknown and the corresponding slot is empty. That is, the slot node ( :Name) in the graph representing model A0 does not have any arrow ( :type) adjoint to it (see the auxiliary top-rightmost box in the figure). Nevertheless, it is convenient to denote the slot and its owner by &’?’ and &&’?’ as if ’?’ were a name.
Since elements of model A0 represent pairs of elements (e1, e2) with ei ∈ Ai, i = 1, 2, we have two inter-model mappings fi : A0 → Ai. Formally, these mappings are functions between the corresponding graphs, e.g., f1 acts on GA0’s nodes as follows:
f1(&&’?’) = &&’OnlineOrder’, f1(&&’price’) = &&’price’,
f1(&’?’) = &’OnlineOrder’, f1(&’price’) = &’price’,
f1(’price’) = ’price’.
Its action on arrows is evident. Mapping f2 is defined similarly. Importantly, both mappings preserve the types of elements, i.e., commute with the typing mappings of the corresponding graphs. In Fig. 3 we specify mappings in a shortened way, but precise formal specifications like the one above will be needed when we consider merging.
We call a pair of mappings with a common source a (binary) span. The source (model A0 ) is called the head of the
span, mappings fi are legs and their targets (models Ai ) are
feet. Thus, an overlap of two homogeneous models is specified by a correspondence span over the same metamodel.
An overlap of n models is described by an n-ary span with
n legs and feet.
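To make these definitions concrete, the following minimal Python sketch (our illustration, not the authors' tooling) encodes typing as a dictionary from elements to their metaclasses and checks that a correspondence leg commutes with the typing mappings; the element and type names are a simplified rendering of Fig. 3.

# Sketch: type-preserving mappings between typed graphs, on nodes only.
def commutes_with_typing(f_nodes, t_src, t_tgt):
    # A leg f: A0 -> A1 preserves types iff t_A1(f(x)) = t_A0(x) for all x.
    return all(t_tgt[f_nodes[x]] == t_src[x] for x in f_nodes)

t_A0 = {"&&?": "Class", "&?": "Name"}                    # typing of the head A0
t_A1 = {"&&OnlineOrder": "Class", "&OnlineOrder": "Name"}  # typing of foot A1
f1   = {"&&?": "&&OnlineOrder", "&?": "&OnlineOrder"}    # the leg f1 on nodes
assert commutes_with_typing(f1, t_A0, t_A1)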
2.3
Merging and conflicts
After specifying the overlap by a correspondence span, we
merge two models into one and check whether it satisfies all
constraints defined in the metamodel.
[Figure 3: Homogeneous Model Matching. Models A1 and A2 with the correspondence span (f1, A0, f2) between them and the colimit merge AΣ with embeddings g1, g2; f1 maps the '?' class of A0 to OnlineOrder in A1, and g1 sends OnlineOrder to the merged class {OnlineOrder, Order}.]
The merge procedure consists of two parts. We first disjointly merge the graphs underlying the models, and then
glue together elements declared to be the same by the span.
The result is shown as diagram AΣ in Figure 3, in which the
merged graph has five rather than six class nodes because of
gluing. Class &&{OnlineOrder,Order} has one name slot
because the two local name slots were also glued, but this
slot holds two names since they are not (and cannot be)
equated in the head. (A precise formal specification of the
mechanism can be found in [6]). Besides graph AΣ , merging
also produces two graph mappings gi : Ai → AΣ that show
how the local models are embedded into the merge.
The merge procedure is fully automatic and can be precisely formalized in terms of the colimit operation developed
in category theory. A detailed explanation and examples of
how colimit works can be found in [21] or [6]. It follows
from general properties of colimit that the merged graph
GAΣ is correctly typed over graph GM (with M denoting
the metamodel of class diagrams).
After we have built the merged graph, we can check whether
it satisfies all constraints defined in the metamodel (say, with
a checking tool). In our example, we find two violations:
class {OnlineOrder, Order} has (i) two names and (ii) two
parent classes.
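For intuition, the gluing step can be sketched as computing the equivalence classes of elements generated by the span; the toy Python illustration below is ours (the precise construction is the colimit described in [6]) and uses only a fragment of Fig. 3's elements.

# Toy sketch: merge two class diagrams modulo a span by gluing matched classes,
# then check the single-name constraint on the result.
from collections import defaultdict

def merge_classes(classes, glued_pairs):
    parent = {c: c for c in classes}
    def find(c):                          # union-find over glued elements
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c
    for a, b in glued_pairs:
        parent[find(a)] = find(b)         # glue elements equated by the span
    merged = defaultdict(set)
    for c in classes:
        merged[find(c)].add(c)
    return list(merged.values())          # each set is one element of the merge

# '@A1'/'@A2' tags keep the local elements distinct before gluing.
classes = ["OnlineOrder@A1", "OfflineOrder@A1", "Order@A2", "Game@A2"]
span    = [("OnlineOrder@A1", "Order@A2")]
for cls in merge_classes(classes, span):
    names = {c.split("@")[0] for c in cls}
    if len(names) > 1:                    # violated: a class must have one name
        print("name conflict:", names)    # -> {'OnlineOrder', 'Order'}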
3. FROM HOMO- TO HETEROGENEOUS
MULTIMODELING: THE PROBLEMS
Existing CCM-approaches [22] handle the homogeneous case well, but in practice software models are often heterogeneous. Business analysts, database experts, and object-oriented software designers all work with different models in different languages, say, BPMN, ER, and UML.
For instance, Fig. 4 presents three different UML models of a system developed independently by three different
teams: a class diagram cd, a statechart sc, and a sequence
diagram sd, whose simplified metamodels are shown in the
right half of the figure.
Since the models are developed independently, synonymy
and homonymy of names, and other similarities and discrepancies between models are quite possible. For example,
classes Order in the class and the sequence diagram may
refer to the same or different classes of the system. If they
refer to the same class, we need to check whether message
settled@sd refers to operation setSettled@cd. If it is the
case, we have a naming conflict (synonymy) between the models; in addition, the parameters of the message and of the operation it refers to are named differently (homonymy): 'd' in cd versus 'date' in sd. Such conflicts are fixable by renaming, but we also need to take the statechart into account.

[Figure 4: Motivating Example. Three UML models developed independently (class diagram cd with class Order, statechart sc with states Created, Paid and Settled, and sequence diagram sd with lifelines :Order and :OrderManager), together with their simplified metamodels mmCD, mmSC and mmSD.]
There may be more serious discrepancies between the models. Suppose, for example, that the sequence diagram states
that parameter ’date’ is of type String while class diagram
declares a different type for the same parameter. This discrepancy violates the condition that an operation parameter
has a single type. This condition is stated in both metamodels (of class and sequence diagrams), but message settled
does not belong to a class diagram and operation setSettled is not in a sequence diagram. There are also semantically motivated constraints that directly regulate interaction
between models defined in different metamodels. For example, we may require that the interaction described by the
sequence diagram is to be allowed by the statechart’s state
machine. Thus, specifying overlap and checking global consistency of heterogeneous models gives rise to several specific
problems caused by heterogeneity.
A) Type-safety is important for overlap specification. In
the homogeneous situation, we allow only elements of the
same type to be matched to ensure type safety. However,
in heterogeneous cases different models are declared in different metamodels, and hence their elements have disjoint
types. We need a new method to ensure type-safety in overlap specifications.
B) Indirect overlap often occurs in heterogeneous multimodeling. For example, in class diagrams operations are
linked to their owning classes. Such linking also exists but is
implicit in sequence diagrams (through consecutive linking
Classes, Objects, Lifelines, Messages, and MsgTypes).
Hence, we cannot use direct matching to describe overlap
between sets of Class-Operation links in class diagrams and
Class-MsgType links in sequence diagrams.
C) Inter-metamodel constraints (like conformance of traces
to statecharts) are important for heterogeneous multimodeling. These constraints regulate interaction of partial models, and hence are not captured by metamodels of any of
them. Such constraints are inherently global and should be
explicitly specified.
D) Metamodel inter-relations become crucial as soon as
we consider type-safety as a fundamental requirement. The
latter implies that model interaction should be coherent with
metamodel interaction, and hence “the metamodel” of a heterogeneous multimodel is a system of metamodels together
with their relationships rather than a discrete set of isolated
metamodels. To address this new dimension of multimodeling, we need a language for specifying systems of interacting
metamodels.
4. HETEROGENEOUS OVERLAP AND CONSISTENCY BY EXAMPLES
In this section we incrementally introduce our approach.
We will consecutively consider very simple examples addressing the principal points: (i) building overlap metamodels to ensure type-safe matching, (ii) the necessity of derived elements, (iii) inter-model constraints, and (iv) N-ary multimodeling with a non-trivial correspondence diagram.
4.1
From heterogeneous to homogeneous overlaps and type-safety
Consider the overlap between class diagram cd and sequence diagram sd in Fig. 4. Suppose we know that class
Order together with methods addItem, setSettled in cd
refer to the same elements in the system as class Order together with message types addItem, settled in sd. However, if we take the type discipline strictly, direct linking of
these elements is prohibited because their types reside in different metamodels. Hence, before matching models we need
to match their metamodels, mmCD and mmSD, as shown
in Fig. 5. Namely, we state that metaclasses Class@mmCD
and Class@mmSD refer to the same concept, and metaclasses Operation@mmCD and MsgType@mmSD are also synonyms. These declarations can be presented by a span in
the middle of Fig. 5. The head of this span is a new overlap metamodel mmCA, and two legs m1,2 map it to the two
metamodels we are matching.
Note that the overlap metamodel can be considered as
a common view between mmCD and mmSD, and mappings
m1 ,m2 as the corresponding view definitions. The view definition m1 : mmCA → mmCD can be executed for any instance of mmCD (i.e., for any class diagram) by extracting
its mmCA-portion and respectively changing its type mapping. For example, class diagram cd shown in left upper
corner of Fig. 6 (we have slightly simplified the class diagram from Fig. 4 to save space) will be translated into
diagram cd2CA typed over metamodel mmCA. We write
cd2CA = getm1 (cd) with getm1 denoting the operation of
view execution (getView ) determined by view definition m1
(in figures we omit the superscript). We will also say that
model cd is projected into the overlap space mmCA, and call
model cd2CA the mmCA-projection of cd. Since the ownership between classes and actions is not specified in the
overlap, the cd2CA-view of cd will be just a discrete set of
named elements. Note also that the view is computed along with a traceability mapping m1 : cd2CA → cd.
Similarly, sequence diagram sd in the top right corner of
Fig. 6 is translated into a discrete set sd2CA = getm2 (sd) of
named elements also typed over mmCA, along with its traceability mapping m2 . Since both views are instances of the
same metamodel, we can type-safely match them and build
a span (ca1 , f1 , f2 ). This span and the corresponding merge
(colimit) are shown in the middle part of Fig. 6. They reveal
a conflict between the models: actions setPaid@cd2CA and
paid@sd2CA are linked but their names are different (in the
merge model cd+sd, the action with two names is shown by
?).
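Under a simple encoding of ours, where a model is a dictionary from element name to metaclass and the view definition is injective on types, view execution can be sketched as follows (names such as get_view are our own):

# Sketch: execute view definition m1: mmCA -> mmCD on a class diagram.
def get_view(m, model):
    inverse = {foot_type: head_type for head_type, foot_type in m.items()}
    view  = {e: inverse[t] for e, t in model.items() if t in inverse}
    trace = {e: e for e in view}          # traceability back into the source
    return view, trace

m1 = {"Class": "Class", "Action": "Operation"}
cd = {"Order": "Class", "addItem": "Operation", "setPaid": "Operation",
      "updTotal": "Operation", "items": "Property"}
cd2CA, trace = get_view(m1, cd)
# cd2CA == {'Order': 'Class', 'addItem': 'Action',
#           'setPaid': 'Action', 'updTotal': 'Action'} -- 'items' is dropped,
# since Property is outside the overlap metamodel mmCA.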
4.2
Indirect overlap
A closer inspection of the original models cd and sd shows
that the conflict above is spurious, because message 'paid'
is actually an operation of class OrderManager rather than
Order. The error occurred because our overlap model does
not capture the relationship between classes and actions (operations). To build a better overlap, we need to match the
ownership edge Class-Operation@mmCD and similar edge
Class-MsgType@mmSD. However, the latter is not directly
included into the metamodel mmSD. Nevertheless, the concepts of MsgType and Class are related indirectly via a sequence of intermediate edges: a message ends at the lifeline,
which belongs to an object, which belongs to a class. We
can compose these three edges into a new — derived — edge
Class-MsgType shown in the metamodel mmSD+ (Fig. 7)
with a dashed line. In addition, we use UML stereotypes
and prefix the names of derived elements by a slash.
In more detail, we augment metamodel mmSD with a new
element mtp (read “messageType”) coupled with its definition,
i.e., specification of some operation computing the instances
of the derived element. In our case, the operation is sequential composition of four association links leading, consecutively, from instances of Class to instances of MsgType. It
can be written in OCL as follows:
context Class
inv: self.mtp=self.objects.lifeline.messages.type
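Operationally, executing this query amounts to composing the four links. The hedged Python sketch below is ours; the data layout is hypothetical, and we simplify by attaching each message to its receiving lifeline only.

# Sketch: compute the derived /mtp links by composing obj, lifeline,
# messages and type, mirroring the OCL navigation above.
def derive_mtp(objects, lifeline, messages, msg_type):
    return {cls: {msg_type[m]
                  for o in objs
                  for m in messages.get(lifeline[o], [])}
            for cls, objs in objects.items()}

objects  = {"Order": ["o1"], "OrderManager": ["om1"]}  # class -> its objects
lifeline = {"o1": "l1", "om1": "l2"}                   # object -> lifeline
messages = {"l1": ["m1"], "l2": ["m2"]}                # lifeline -> messages
msg_type = {"m1": "addItem", "m2": "paid"}             # message -> MsgType
print(derive_mtp(objects, lifeline, messages, msg_type))
# {'Order': {'addItem'}, 'OrderManager': {'paid'}} -- paid is owned by
# OrderManager, which reveals that matching setPaid with paid was illegal.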
Now we declare the sameness of associations oper@mmCD
and mtp@mmSD+ by placing association act into the head
of the span as shown in Fig. 7, and defining m1 (act) = oper,
m2 (act) = /mtp. Since mappings m1 , m2 in Fig. 7 define
richer views than earlier defined mappings m1 ,m2 in Fig. 5,
projections cd2CA and sd2CA in Fig. 7 are also richer than in
Fig. 5 and include links between classes and operations. We
at once see that matching setPaid@cd2CA and paid@sd2CA
is illegal, and the corresponding “equation” must be removed
from the span. The result of merging models cd2CA and
sd2CA modulo the new span ca1 is shown in the middle
bottom of Fig. 7. It is a correct mmCA model satisfying the
constraints of mmCA: an element may have only one name,
and different actions owned by a class are named differently.
[Figure 5: Example of metamodel overlap. The overlap metamodel mmCA (metaclasses Class and Action) with legs m1 into mmCD (Class to Class, Action to Operation) and m2 into mmSD (Class to Class, Action to MsgType).]

[Figure 6: Example of model overlap over the respective metamodel overlap (see Fig. 5 for view definitions). The projections cd2CA and sd2CA, the correspondence span ca1 with legs f1, f2, and the colimit cd+sd, in which the action matched by (setPaid, paid) carries two names and is shown as ?.]

[Figure 7: Matching basic and derived meta-elements. The metamodel mmSD+ augments mmSD with the derived association /mtp (marked <<derived>> and dashed); mapping m1 sends the head association act to oper in mmCD, while m2 sends it to /mtp in mmSD+.]

[Figure 8: Matching basic and derived elements (see Fig. 7 for view definitions). The richer projections cd2CA and sd2CA now include Class-Action links; the span ca1 matches only Order and addItem, and the colimit cd+sd is a correct mmCA model.]
The next section will show more interesting cases of using
derived elements in overlap specification.
4.3
Inter-metamodel constraints
So far we have only checked the constraints declared in the head
of the correspondence span (mmCA in our examples). These
constraints are common for both feet metamodels (mmCD
and mmSD). However, as discussed in Section 3, there may
be important constraints which reside in neither of the feet
metamodels. For example, traces of actions exhibited by a
sequence diagram must conform to the state machine specified by the corresponding statechart. We will denote this
constraint by t#sm, meaning "Traces are to conform to the StateMachine". Declaring the constraint t#sm requires elements from both metamodels, mmSD and mmSC, and cannot be done in either of them. Hence, a new metamodel has to be built in which t#sm can be specified. In this
section we first show how to build such a metamodel, and
then show how to project partial models sd and sc to the
space of this metamodel's instances, in which the projections can be matched, merged and checked against t#sm.
To declare t#sm, we need a metamodel encompassing metaclasses for Classes, Traces (sequences of actions), StateMachines, and related notions (States, Transitions, Events), as
specified by metamodel mmCTrSM in the middle of Fig. 9.
The upper half of this metamodel is “taken” from the sequence diagram metamodel mmSD as specified by mapping
m1 in Fig. 9. Note that m1 maps class Trace@mmCTrSM to
derived class /Trace@mmSD, whose instances are sequences
of actions described by the sequence diagram and hence can
be computed by a suitable query. The lower half of mmCTrSM is taken from the statechart metamodel mmSC as
specified by mapping m2 in Fig. 9 (and we again use derived elements). Having built metamodel mmCTrSM, we
declare in it the constraint t#sm with its intended semantics. We call the configuration (m1, mmCTrSM, m2) a partial
span because mappings m1 and m2 are partially defined (on
the upper and lower halves of mmCTrSM resp.). In Fig. 9
and other figures below, a semi-arrow head indicates partiality of the mapping.
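The intended semantics of t#sm declared above can be illustrated by running the protocol state machine over a trace; the minimal sketch below is ours, with an assumed encoding of transitions as a dictionary keyed by (state, event).

# Sketch of the semantics of t#sm: a trace conforms to the state machine iff
# the machine can execute its events starting from the initial state.
def conforms(trace, initial, transitions):
    state = initial
    for event in trace:
        if (state, event) not in transitions:
            return False       # no transition for this event in this state
        state = transitions[(state, event)]
    return True

sm = {("Created", "addItem"): "Created",   # transitions of the Order statechart
      ("Created", "paid"):    "Paid",
      ("Paid",    "settled"): "Settled"}
print(conforms(["addItem", "paid", "settled"], "Created", sm))   # True
print(conforms(["settled", "paid"], "Created", sm))              # False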
The next step is to project models sd and sc to the metamodel mmCTrSM. We cannot directly execute view definitions mj (j = 1, 2) because they are partial, but we can
execute them in three steps.
Step 0. We explicitly specify the domains mmCTr and mmCSM of mappings mj (j = 1, 2; see Fig. 10), on which they become
totally defined mappings m!j ; inclusion mappings ij embed
the domains into the head of the span.
Step 1. Total view definitions m!j (j = 1, 2) are executed
for models sd and sc and produce views sd2CTr and sc2CSM
over metamodels mmCTr and mmCSM resp.
Step 2. Because the two latter metamodels are included
into mmCTrSM, we may consider their instances as “partial”
instances of mmCTrSM. Formally, we compose typing mappings of models sd2CTr, sc2CSM with inclusion mappings ij ,
j = 1, 2 and get typing mappings into mmCTrSM. In Fig. 10,
these new typing mappings are marked by ∗.
The three steps are performed automatically and may be
hidden from the user, who observes the projection mappings
getm1 and getm2 as if mappings mj were ordinary total view
definitions.
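Under the dictionary encoding of our earlier sketches, the three steps reduce to a short composition; the function names below are ours.

# Sketch: executing a partial view definition m = (i, m!) on a model.
def get_total(m_total, model):                    # Step 1: run the total part m!
    inverse = {v: k for k, v in m_total.items()}
    return {e: inverse[t] for e, t in model.items() if t in inverse}

def get_partial(m_total, inclusion, model):
    view = get_total(m_total, model)
    # Step 2: retype over the whole head by composing typing with inclusion i.
    return {e: inclusion[t] for e, t in view.items()}
# Step 0 (fixing the domain of m) is implicit: m_total is total on dom(m).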
Now we have two models sd2CTr and sc2CSM over the same metamodel mmCTrSM.

[Figure 11: Metamodel schema of the example in Fig. 4. Metamodels mmCD, mmSD and mmSC, the ternary span mmCA with legs m1, m2, m3, the partial span mmCTrSM with legs m4, m5, and the connecting mapping m6; both triangles commute (marked =).]

To finish consistency checking,
the user must match the models and build a correspondence
span, say, (f1 , ca2, f2 ). The head of the span is denoted by
ca2 because it is, in fact, an instance of metamodel mmCA
built in Section 4.2 (it can be formally proved). After that,
the system merges models modulo the span and checks the
result against the constraints in mmCTrSM, including the
inter-metamodel constraint t#sm. The entire procedure can be seen in the right half of Fig. 10: data provided by the
user are shown with bullet nodes and solid arrows (and are
black), data automatically computed are shown with blank
nodes and dashed arrows (and are blue).
4.4
N-ary multimodeling and metamodel schemas
In this subsection we consider our full example involving
all three models, cd, sd and sc.
First we build a ternary span (mmCA, m1 , m2 , m3 ) specifying correspondences between operations, messages and
transitions in cd, sd, sc resp. as shown in Fig. 11; a dashed
frame indicates that the metamodel is augmented with derived elements defined by queries. Ternary span mmCA is
a straightforward extension of binary span mmCA built in
Section 4.2 with a new leg towards mmSC. Projecting the three
models to the head, matching them with a ternary correspondence span, say, ca3 (see Fig. 12), merging projections
modulo ca3, and finally checking the constraints against the
merge can be done in exactly the same way as in Section 4.2.
A minor distinction is that the leg ca3→getm2 (sd) is partial
because there are binary (rather than ternary) correspondences like (setPaid@cd, paid@sc) that do not involve sd’s
elements; colimit operation consumes such correspondences
as well.
The second point of consistency checking is at the span (mmCTrSM, m4, m5), where the constraint t#sm is to be checked
as explained in Section 4.3. However, when we consider all
three models, the correspondence span ca2 between projections getm4 (sd) and getm5 (sc) can be derived from the span
ca3 rather than specified independently. Indeed, we have
mapping m6 that sends nodes Class and Action and edge
act between them to the corresponding elements in mmCTrSM. By applying the retyping procedure explained in Section 4.3, we project the span ca3 into mmCTrSM and get a
span ca2 as shown in Fig. 12 (where the block arrow rtpm6
denotes the retyping operation). After the span ca2 is computed, we proceed exactly as described in Section 4.3 and
check the constraint t#sm.
An important property of the metamodel schema in Fig. 11
is commutativity of the two triangle diagrams (note two =-
labels):

(=)m    m6 ; m4 = m2   and   m6 ; m5 = m3.

Because view execution and retyping preserve metamodel mapping composition (we will formalize these properties in Section 5), we have commutativity for the view execution mappings as well:

(=)get    getm4 ; getm6 = getm2   and   getm5 ; getm6 = getm3.

Hence, we have only one projection of sequence diagram sd to the instance space of mmCA, and only one projection of sc to the same space.

The simple example above shows how local model interaction is governed by the multimodel schema specifying the metamodels' inter-relationships. The example also demonstrates that N-ary multimodeling may exhibit quite complex metamodel schemas bearing their own constraints, such as commutativity.

[Figure 9: Specifying inter-metamodel constraints. The metamodel mmCTrSM (Class, Trace, Action, StateMachine, State, Transition, Event; constraint t#sm) with partial mappings m1 from mmSD, which uses the derived elements /Trace, /msg, /tr and /mtp, and m2 from mmSC, which uses the derived element /events.]

[Figure 10: Verifying inter-metamodel constraints. The domains mmCTr and mmCSM of the partial mappings, the total view definitions m!1 and m!2, the projections sd2CTr and sc2CSM, the correspondence span ca2 and the colimit sd +ca2 sc, which is checked against the inter-metamodel constraint C.]

[Figure 12: Global consistency checking of the example in Fig. 4. User-specified data: models cd, sd, sc and the correspondence spans ca3 and ca2; automatically generated data: the projections getm1, getm2, getm3, getm4, getm5 and the retyping rtpm6 that derives ca2 from ca3.]

5. MAKING MULTIMODELING PRECISE: A GENERAL FRAMEWORK

The three basic ingredients of our approach are (i) metamodels and their mappings, (ii) models and their mappings, and (iii) a mechanism of model translation from one metamodel to another. We build a (in a sense minimal) mathematical framework that allows us to define these concepts and their inter-relations in Section 5.1. In Section 5.2 we show that global consistency checking can indeed be realized in this framework. In Section 5.3 we show how the abstract framework of Section 5.1 can be implemented with constructs close to modeling practice: typed structures, query and constraint languages.

Due to space limitations, the presentation is very brief and semi-formal: we show how the concepts could be formally defined rather than present real formal definitions. We use simple category theory concepts without explanation, and refer to basic concepts of institution theory [14], an abstract framework for logic and model theory.

5.1
Abstract multimodeling framework

An abstract multimodeling framework F abstr is a tuple of constructs defined below.
1) A category MMod whose objects are called metamodels
and arrows are metamodel mappings.
2) Each metamodel M is assigned two categories, one being a subcategory of the other: [[ M ]] ⊂ [[ M ]]? . Intuitively, objects of [[ M ]]? are structures properly typed over
M but perhaps violating M ’s constraints (hence the question mark); we will call them structural instances. Objects
of [[ M ]] are (legal) models: structural instances of M satisfying, in addition, all constraints in M .
We require all categories [[ M ]]? to be closed under colimits (merging). This is the case for many classes of structures
carrying metamodels and models like graphs or attributed
graphs. But we do not require this property on [[ M ]]. Our
examples above show that in practically interesting situations [[ M ]] is not closed under colimits.
3) Any metamodel mapping m : M → N ::MMod is assigned a getView functor getm : [[ N ]] → [[ M ]] that maps
in the opposite direction (think of m as a view definition and
getm as a view execution).
Moreover, if m = 1M is the identity mapping of metamodel M, then getm is the identity functor on [[ M ]], and for two consecutive mappings m1 : M → N and m2 : N → O,

getm1;m2 = getm2 ; getm1 : [[ O ]] → [[ M ]]

(a sequentially composed view definition is executed consecutively).
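Under the same toy dictionary encoding as in our earlier sketches, this law can be spot-checked; the block below is an illustration of ours, not a proof.

# Sketch: spot-check get over (m1;m2) = get over m2 followed by get over m1.
def compose(m1, m2):
    return {t: m2[m1[t]] for t in m1}              # (m1;m2)(t) = m2(m1(t))

def get_total(m, model):
    inverse = {v: k for k, v in m.items()}
    return {e: inverse[t] for e, t in model.items() if t in inverse}

m1 = {"Action": "Operation"}                       # M -> N
m2 = {"Operation": "Method"}                       # N -> O
model_O = {"setPaid": "Method", "items": "Field"}  # an O-instance
assert get_total(compose(m1, m2), model_O) == \
       get_total(m1, get_total(m2, model_O)) == {"setPaid": "Action"}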
4) A subcategory MModinc ⊂ MMod of inclusion mappings is fixed: it has the same objects but fewer mappings
than MMod. A formal inclusion mapping i : M → N ::MModinc
is to be thought of as inclusion of metamodel M into a bigger
metamodel N .
Any inclusion i : M → N is assigned a retyping functor rtpi : [[ M ]]? → [[ N ]]? (think of the retyping described in Sections 4.3 and 4.4).
Note that, in contrast to operation get, rtp maps structural instances (in particular, models) to structural instances (not necessarily models): even if an instance A is an M-model, we cannot guarantee that rtpi (A) satisfies all constraints in N.
Similarly to get, we require rtp1M to be the identity functor on [[ M ]]? , and for two consecutive inclusions m1, m2 as above, rtpm1;m2 = rtpm1 ; rtpm2 : [[ M ]]? → [[ O ]]? .
We will write an abstract multimodeling framework in a
short form as a triple F abstr = (MMod, get, rtp) assuming
that the [[ ]]-part of the construction is “included” into get,
and the [[ ]]? and MModinc parts are “included” into rtp.
Operations get and rtp together provide model translation
over partial mappings. A partial mapping m : M ⇀ N between metamodels (note the semi-arrow head) is, formally, a diagram M ← Dm → N, with Dm ⊂ M a metamodel called the domain of m (while M is the source of
their mappings into M ’s ones. We will denote this composition by getm (so that the actual meaning of getm depends
on whether m is a total or a partial mapping).
5.2
Multimodels and their consistency
Let F abstr = (MMod, get, rtp) be an abstract multimodeling framework.
A homogeneous multimodel over F abstr is a pair (M, A)
with M ∈ MMod a metamodel and A a diagram in [[ M ]]; the latter can be thought of as a family of models together with a system of correspondence spans. A multimodel is consistent if the colimit AΣ =def ΣA (which always exists in [[ M ]]? ) satisfies M's constraints, i.e., AΣ ∈ [[ M ]].

A heterogeneous multimodel is a tuple

AA = (A1 :M1 . . . An :Mn )
with Mi ∈ MMod and Ai a homogeneous multimodel over
Mi , i = 1..n. Consistency of a heterogeneous multimodel is
much more involved than in the homogeneous situation, and
we will begin with a simpler case of discrete multimodels, for
which each diagram Ai is actually a set of models without
mappings between them.
The algorithm for checking global consistency of a discrete heterogeneous multimodel AA is as follows. We begin
with specifying a system of common views (overlaps) between metamodels Mi . For simplicity, we assume that such
a system amounts to a set M of total and partial spans, like the one shown in Fig. 11 with mapping m6 between the spans removed. Global consistency of AA is checked
at the heads of these spans. That is, for each span S in M
we perform the following procedure.
Let H be S’s head. First, we project to the space [[ H ]]? of
structural H-instances all models Ai , whose metamodels Mi
are reachable from H by the legs of the span. If the span is
total, projecting is provided by the view mechanism. If the
span is partial, projecting needs both view execution and
model retyping as explained above. In this way we obtain a
set of instances AH ⊂ [[ H ]]? .
Second, instances in AH are matched by a correspondence
diagram E H (for example, think of spans ca2 or ca3 in our
examples). Note that E H -data are provided by the user and
are, in fact, part of the multimodel’s state.
Third, all instances in AH are merged modulo the correspondence diagram E H into a structural instance

(AΣ )H =def (ΣAH /E H ) ∈ [[ H ]]? .
Finally, we check whether (AΣ )H ∈ [[ H ]], i.e., whether it
satisfies all constraints declared in H.
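In pseudocode-flavoured Python, the whole procedure for one span reads as follows; the encoding of spans and all helpers (projection along a leg, merge modulo a correspondence diagram, constraint checking) stand for the operations sketched in earlier sections and are assumptions of ours.

# Sketch of global consistency checking for a discrete heterogeneous multimodel.
def check_global_consistency(spans, models, E, project, merge, satisfies):
    verdict = {}
    for span in spans:
        H = span["head"]
        # 1. Project the models reachable by the legs into [[H]]?.
        A_H = [project(leg, models[leg["foot"]]) for leg in span["legs"]]
        # 2. Take the user-provided correspondence diagram E_H over A_H,
        # 3. merge the projections modulo E_H (a colimit in [[H]]?),
        merged = merge(A_H, E[H["name"]])
        # 4. and check all constraints declared in H on the merged instance.
        verdict[H["name"]] = all(satisfies(merged, c) for c in H["constraints"])
    return verdict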
A general multimodeling case with Ai being diagrams
rather than sets can be treated similarly. The key is that
translation operations get and rtp are functors, that is, they
translate not only instances but also instance mappings, and
hence correspondence diagrams as well. Then the projection
AH ⊂ [[ H ]]? will be a diagram rather than a set of instances,
and diagram E H will provide a second level correspondence
structure. As colimit operation consumes any sort of input
diagrams, the algorithm works well for the general case too.
Another generalization of the algorithm, for which the
metamodel schema is more complicated than a set of spans,
is harder and is a work in progress.
5.3
Concrete multimodeling framework
In a nutshell, a concrete multimodeling framework consists of three components: (i) a base category G of graph-like
structures to be thought of as the carriers of metamodels and
models, (ii) a constraint language C together with binary relations |= of satisfying a constraint by a model, and (iii) a
query language Q together with operations of query execution over a model. In more detail (but still very briefly with
many important conditions skipped), a concrete framework
is given by the following constructs.
1) G-objects are to be thought of as graphs, or many-sorted (colored) graphs, or attributed graphs [11]. The key
point is that they are definable by a metametamodel itself
being a graph with, perhaps, a set of equational constraints.
In precise categorical terms, we require G to be a presheaf topos [3], which hence possesses limits, colimits, and other important properties. We will call G-objects "graphs".
For a “graph” G thought of as a metamodel, an instance
of G is a pair A = (DA , tA ) with DA another “graph” and
tA : DA → G a mapping (arrow in G) to be thought of as
typing. An instance mapping f : A → B is a “graph” mapping f : DA → DB commuting with typing: f ; tB = tA . This
defines a category [[ G ]] of G-instances.
Any mapping m : G′ → G determines a functor
pbm : [[ G ]] → [[ G′ ]], built with the pullback operation in the standard way (see, e.g., [15, p.48]).
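On nodes, and for the dictionary encoding used in our sketches, the pullback is simply the set of pairs of elements that agree in G; the sketch below is ours.

# Sketch: pullback retyping of an instance (D, t) along m: G' -> G, nodes only.
def pullback(m, instance):
    # Elements are pairs (d, g') with t(d) = m(g'); typing is the 2nd projection.
    return {(d, gp): gp
            for d, g in instance.items()
            for gp, mg in m.items() if mg == g}

m    = {"Action": "Operation"}                  # G' -> G
inst = {"setPaid": "Operation", "items": "Property"}
print(pullback(m, inst))                        # {('setPaid', 'Action'): 'Action'}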
2) Constraints are defined exactly like in the institution
theory. We postulate a functor C : G → Sets and a binary
relation |=G ⊂ [[ G ]]×C(G) for every “graph” G. For an instance A ∈ [[ G ]] and a constraint c ∈ C(G), we write A |=G c
for (A, c) ∈|=G .
3) Queries are an original part of the definition. We begin with a functor Q : G → G of query specifications. For
a “graph” G ∈ G, the “graph” Q(G) ⊃ G is to be thought
of as “graph” G augmented with definitions of derived elements. (Actually we require Q to be a monad [3]). Functor
Q also acts on constraints: for a “graph” G and a set of constraints C ⊂ C(G) over G, there is a set Q(C) ⊂ C(Q(G))
of constraints derived from C.
Semantics of query specifications is given by an operation [[ Q ]] that maps G-instances to Q(G)-instances, as specified by the following square (in the original inset diagram, the two derived arrows were dashed and the derived node underlined):

    DA   ⊂→   D[[Q]](A)
    tA ↓           ↓ t[[Q]](A)
    G    ⊂→   Q(G)

We require this diagram to be a pullback square, which means that "graph" DA is the inverse image of "graph" D[[Q]](A); that is, the original data are not changed by the query execution.
To ensure that derived instances satisfy derived constraints,
we require the following to hold for any instance A:
(QC)
A |=G C implies [[ Q ]](A) |=Q(G) Q(C).
Finally, we require operation [[ Q ]] to act also on instance
mappings: for any injective arrow f : A → B in [[ G ]], there
is defined an injective arrow [[ Q ]]f : [[ Q ]](A) → [[ Q ]](B) in
[[ Q(G) ]]. In the database literature, this property of a query
language is called monotonicity, and it is known that queries
without negation are monotonic [18].
From these data we can derive an abstract framework
F abstr along the following lines. We first fix a subcategory
G◦ ⊂G of finite “graphs” to be the carriers of metamodels. A
metamodel is a pair M = (GM, CM) with GM ∈ G◦ a carrier graph and CM ⊂ C(GM) a set of constraints. Structural instances of M are instances of GM, i.e., [[ M ]]? =def [[ GM ]], and models of M are GM's instances satisfying CM.
Metamodel mappings are G-arrows of the form m : GM → Q(GN) (Kleisli arrows of the monad Q), which are compatible with constraints: C(m)(CM) ⊂ Q(CN). Any such mapping determines a functor getm =def [[ Q ]]; pbm : [[ N ]] → [[ M ]], which satisfies the conditions postulated in the definition of the abstract framework. The retyping functors rtp are defined by composition (as in the example of Section 4.3) and also satisfy the necessary conditions. With accurate formal definitions, it can be proved that every concrete multimodeling framework gives rise to an abstract multimodeling framework. Hence, the algorithm of global consistency checking can be used with a concrete framework as well.
6. RELATED WORK
Specifying overlaps of homogeneous models by correspondence spans has been known for a long time [13, 5, 4, 17]. Close
relations between consistency checking and model merging
were noticed in [7] for behavioral, and in [22] for structural
models. A large body of work in this direction was done
in databases in the context of view integration, where they
worked mainly with ER-diagrams [23]; a generalization for a
much more expressive graph-based language was developed
in [5]. A serious limitation of these approaches is that they work for the homogeneous case only, because it was unclear how to merge heterogeneous models.
Consistency of heterogeneous models is a central issue of
the living with inconsistency frameworks [20, 24, 19, 10].
Their basic idea is to translate all models and constraints
into a common logical formalism, and check if there are conflicts between logical formulas. Although these approaches
handle many cases in heterogeneous multimodeling, the configuration of model overlap (which may be very intricate as
our examples show) is flattened and hidden in arrays of formulas. As a result, none of the approaches fully covers heterogeneous multimodeling: they mainly handle well-defined
cases where elements are matched by names, or only pairwise
cases. In contrast, the structure of inter-model relationships
is made visible and essentially used in our approach.
Several approaches also transform models to aid model
merging and consistency management. Egyed [8] proposes a
flexible framework based on model transformation and mapping; however, it is the user’s responsibility to use them
correctly. Ehrig et al. [12] use graph transformation to derive views from a reference model, and integrate modified
views using colimit. Compared to our work, their work requires users to define the transformation manually. Jurack
and Taentzer [16] consider multimodeling (they say composite models) in a distributed environment. Their setting is
mainly operational and is based on graph transformations.
None of the approaches handles inconsistent views.
Many researchers focus on discovering traceability links
between heterogeneous models [2] and discovering differences between homogeneous models [1]. Their results can be
integrated into our approach as a means for an automated
construction of correspondence spans.
7. CONCLUSION
The paper describes a general approach to global consistency checking of heterogeneous multimodels. The approach
is based on finding common views between metamodels of
the models involved, projecting all models to these views,
merging projections and checking the result against the constraints specified in the view. We have shown that type-safe
matching, indirect model overlap, and inter-metamodel constraints can be uniformly managed along the lines described.
The approach gives rise to a novel framework for heterogeneous multimodeling, in which a network of interrelated
metamodels, the metamodel schema, plays the central role.
The framework has a number of advantages. First, heterogeneous consistency checking is reduced to the homogeneous case with a minimal amount of metamodel merging; the latter is unavoidable if we want to treat inter-metamodel constraints, yet we work as locally as possible. Second, the framework
is applicable to a wide class of models and metamodels satisfying not too restrictive conditions formulated in Section
5. Third is the adaptability of the framework to the living with inconsistencies paradigm: conflicts between models can be recorded in the heads of the correspondence spans
and resolved later. Fourth, heterogeneous multimodeling becomes directly related to institution theory and hence
to a source of important (and hard to prove) mathematical results about interrelation of logical theories and their
models.
However, the approach still needs practical, and in part
also theoretical, validation. On the practical side, the main
question is how effectively a multimodeling tool based on
the framework could be implemented. On the theoretical
side, the cornerstone of the approach is a default assumption
that our “as local as possible” consistency checking is equivalent to direct global consistency checking. By the latter we
mean merging all metamodels into one global metamodel MM; then all partial models become partial instances of MM, whose joint consistency can be checked by a homogeneous CCM-algorithm. There are strong formal arguments justifying this assumption, but an accurate formal proof is still to be completed.
An important theoretical line of future work is to develop a
useful classification of heterogeneous multimodels. We may
classify multimodels by the type of their metamodel schema:
whether it is a plain collection of spans, or there are spans
over spans over spans..., or perhaps even more complex configurations. Types of mappings in the metamodel schema
are also essential: whether they are plain projections or
complex views involving non-trivial queries. Complexity of
queries involved in the metamodel schema of a multimodel
is its important property, and many useful results can be
found in the database literature. Defining multimodeling in
abstract mathematical terms along the lines described in the
paper would allow useful interaction of the two fields.
8. REFERENCES
[1] M. Alanen and I. Porres. Difference and union of
models. In UML, pages 2–17, 2003.
[2] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia,
and E. Merlo. Recovering traceability links between
code and documentation. IEEE Transactions on
Software Engineering, 28(10):970–983, 2002.
[3] M. Barr and C. Wells. Category theory for computing
science. Prentice Hall, 1995.
[4] P. Bernstein and R. Pottinger. Merging models based on given correspondences. In VLDB, 2003.
[5] B. Cadish and Z. Diskin. Heterogeneous view integration via sketches and equations. In ISMIS, pages 603–612, 1996.
[6] Z. Diskin. Model synchronization, mappings, tile
algebra, and categories. In GTTSE’09. Springer. To
appear.
[7] S. M. Easterbrook and M. Chechik. A framework for
multi-valued reasoning over inconsistent viewpoints. In
ICSE, pages 411–420, 2001.
[8] A. Egyed. Heterogeneous view integration and its automation. PhD thesis, University of Southern California, 2000.
[9] A. Egyed. Instant consistency checking for the UML. In ICSE, pages 381–390, 2006.
[10] A. Egyed. Fixing inconsistencies in UML design models. In ICSE, pages 292–301, 2007.
[11] H. Ehrig, K. Ehrig, U. Prange, and G. Taentzer. Fundamentals of Algebraic Graph Transformation. 2006.
[12] H. Ehrig, R. Heckel, G. Taentzer, and G. Engels. A combined reference model- and view-based approach to system specification. Int. Journal of Software and Knowledge Engineering, 7:457–477, 1997.
[13] J. L. Fiadeiro and T. S. E. Maibaum. Interconnecting formalisms: Supporting modularity, reuse and incrementality. In SIGSOFT FSE, pages 72–80, 1995.
[14] J. Goguen and R. Burstall. Institutions: Abstract model theory for specification and programming. Journal of the ACM, 39(1):95–146, 1992.
[15] B. Jacobs. Categorical Logic and Type Theory. Elsevier Science Publishers, 1999.
[16] S. Jurack and G. Taentzer. Towards composite model transformations using distributed graph transformation concepts. In MoDELS, pages 226–240, 2009.
[17] H. Liang, Z. Diskin, J. Dingel, and E. Posse. A general approach for scenario integration. In MoDELS, pages 204–218, 2008.
[18] H. Liefke and S. Davidson. View maintenance for hierarchical semistructured data. In DaWaK, pages 114–125, 2000.
[19] C. Nentwich, W. Emmerich, and A. Finkelstein. Consistency management with repair actions. In ICSE, pages 455–464, 2003.
[20] B. Nuseibeh, J. Kramer, and A. Finkelstein. Viewpoints: meaningful relationships are difficult! In ICSE, pages 676–683, 2003.
[21] M. Sabetzadeh and S. M. Easterbrook. View merging in the presence of incompleteness and inconsistency. Requir. Eng., 11(3):174–193, 2006.
[22] M. Sabetzadeh, S. Nejati, S. Liaskos, S. M. Easterbrook, and M. Chechik. Consistency checking of conceptual models via model merging. In RE, pages 221–230. IEEE, 2007.
[23] S. Spaccapietra and C. Parent. View integration: A step forward in solving structural conflicts. IEEE Trans. Knowl. Data Eng., 6(2):258–274, 1994.
[24] R. V. D. Straeten, T. Mens, J. Simmonds, and V. Jonckers. Using description logic to maintain consistency between UML models. In UML, pages 326–340, 2003.
[25] J. Warmer and A. Kleppe. The Object Constraint Language: Precise Modeling with UML. Addison-Wesley, 2000.
Anticipating Unanticipated Tool Interoperability
using Role Models
Mirko Seifert, Christian Wende, Uwe Aßmann
Technische Universität Dresden, Software Technology Group, Dresden, Germany
mirko.seifert@tu-dresden.de, c.wende@tu-dresden.de, uwe.assmann@tu-dresden.de
ABSTRACT
The interoperability of tools heavily relies on their ability to
exchange shared data. While the definition of standardised
metamodelling languages such as the Essential Meta Object
Facility (EMOF) [23] has substantially simplified the task of
reading and persisting arbitrary domain data, there are still
open issues concerning the integration of domain abstractions (metamodels) used by different tools. For example,
accessing common data by shared metamodels is limited,
because of the lack of first-class support for metamodel composition. Data that is processed using multiple tools must
be either stored in a common abstraction—which introduces
a strong coupling of the involved tools—or is replicated (e.g.,
represented in different tool formats)—which introduces the
need for tedious synchronisation.
In this paper we present how role-based metamodelling can
overcome these limitations and provide a formalism to enable tool interoperability by role composition. Based on a
running example, the implications of the current problems
of tool integration are shown and their resolution based on
role modelling is discussed.

Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability - Data mapping; I.6.5 [Simulation and Modeling]: Model Development - Modeling methodologies

General Terms
Design

Keywords
role modelling, tool integration
1. INTRODUCTION

The interoperability of heterogeneous tools has been studied for a long time, as it is vital to increasing the productivity gained by software systems. Interoperability can be divided into two aspects. First, to enable interoperability, software must be able to access shared information. Second, the behaviour of different systems needs to be integrated.
To address the former issue—sharing data across tools—
both syntax and semantics of the data that tools operate
on must be known and formally defined. In this context,
metamodelling languages (e.g., EMOF) have significantly
improved the specification of data structures. By using a
standardised formalism to describe and exchange the abstract syntax of domain data, a big obstacle for tool interoperability has vanished. The explicit representation of
domain concepts in metamodels allows tool integrators to
understand the data they are dealing with. While this may
appear to be a simple requirement, it is not fulfilled by tools
that use internal or implicit metamodels.
Besides the explicit representation of concepts using metamodels, standardised serialisation allows data to be retrieved and persisted uniformly. The mapping from the abstract to the
concrete syntax (i.e., the actual representation of data in
files) is defined by the metamodelling facilities rather than
individual tools. The tedious task of reading custom file
formats to obtain the contained data becomes superfluous.
Despite all gained benefits, metamodelling does not solve
all problems related to tool interoperability. This is particularly caused by the need for explicit anticipation of future
interoperability between tools. If this concern is neglected
during the creation of metamodels, later integrations are
hard to accomplish.
Object-oriented metamodelling languages like EMOF do allow tool interoperation to be handled proactively, by designing metamodels in such a way that future extension and interoperation are anticipated. Distinct patterns exist which allow the extension of metamodels [7]. However, the existence of these patterns does not force tool developers to
use them. As a result, the decision whether to ease future
interactions between tools or not is completely left to the
tool designer. Usually this yields limited anticipation for
extension.
In an ideal world all future tool integrations would be
known beforehand, which would substantially ease the task
of anticipating every interaction between tools. However,
Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability—Data
mapping; I.6.5 [Simulation and Modeling]: Model Development—Modeling methodologies
General Terms
Design
Keywords
role modelling, tool integration
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
MDI2010 October 5, 2010, Oslo, Norway
Copyright 2010 ACM 978-1-4503-0292-0/10/10 ...$10.00.
52
new tools are created and existing ones evolve dynamically,
which renders perfect anticipation impossible. One can also
say that tools must anticipate the unanticipated.
An approach to handle unanticipated tool interoperation
retroactively is to apply transformations. Transformations
convert data created by one tool to a different representation
that is understood by another tool. However, in this case
each tool keeps its own local version of the data, which leads to data replication. This can in turn cause inconsistencies
and increased memory consumption.
Instead of transformations, which copy data, one can also use views, which convert data as needed without keeping replicated data. Unfortunately, such interoperability is often achieved in an ad-hoc manner. Adaptors are implemented to present existing data in a specific view, which can be used by some tool. Thus, the structure of this view (i.e., the interface required by a tool to access data) is not separated from the binding of this interface to some data source. This lack of separation of concerns renders tool integration fragile. If one is not aware of the presence of an access interface, because it is not explicit, changes are applied carelessly and all existing adaptations need to be adjusted.
To overcome the limitations outlined above and to simplify tool interoperability, we propose to use role models as
an extension of object-oriented metamodelling. The concept
of roles has been originally presented in [18] and applied to
the design of frameworks in [19]. In [18] a role model is
defined as a unit to isolate an area of concern. To establish interoperability, the concern that each tool deals with is
captured by a role model.
The contributions of this paper are the following. First,
a set of requirements for flexible and durable tool integration is formulated. Second, object-oriented metamodel
integration and model transformation—two common techniques for proactive and retroactive tool integration—are
analysed w.r.t. these requirements. In particular, the problems induced by requirements that are not met by the two
approaches are depicted. Third, a role-based approach for
the specification and integration of metamodels is presented,
which overcomes the discovered limitations.
This paper is organised as follows. After presenting a running example in Sect. 2, we will analyse the problems of existing proactive and retroactive tool integration techniques
in Sect. 3. The usage of role models to leverage tool integration will be discussed in Sect. 4. We will compare our
approach with related work in Sect. 5 and conclude with
Sect. 6.
2. RUNNING EXAMPLE

We illustrate our approach with a simplified, exemplary scenario of tools that we would like to make interact. Suppose we want to create, visualise and validate state machines. To achieve this, we want to integrate three tools that use tool-specific data abstractions. This scenario is depicted in Fig. 1.

[Figure 1: Example scenario. The metamodels of a textual state machine editor (State with name, Transition with condition, StateType: PLAIN, INITIAL, FINAL), a 2D shape renderer (Shape with x, y, size and label; Kind: CIRCLE, RECTANGLE, LINE; Colour: WHITE, BLACK, RED), and a graph analysis tool (Node with an invalid flag, Edge with source and target).]

First, we use a textual editor for state machines. This editor is aware of the domain concepts of state machines (i.e., states and transitions). We restrict ourselves here to a simplified representation of the state machine domain.

Second, we want to visualise our textual state machines graphically, which is why a tool that can lay out and render two-dimensional shapes is employed. This tool was not specifically designed to render state machines; it rather uses the concepts of shapes, coordinates and colours.
Third, we want to check our state machines statically
against certain well-formedness criteria. For this purpose,
a generic graph analysis tool shall be used. This tool is also
not aware of concepts specific to the state machine domain,
but based on nodes and edges. Since the goal of the analysis
tool is to find invalid elements, nodes carry a boolean attribute invalid.
Using the graph analysis tool we would like to check that
initial states do not have incoming transitions and that final states do not have outgoing ones. We might also restrict the total number of transitions for one state to 10,
because higher numbers indicate a bad design of our state
machine model (see [9] for more details about graph constraints and [27] for a similar example).
One can see that each of the tools owns a specific metamodel, which captures the domain abstraction appropriate for the task of the tool. The depicted metamodels are independent of each other. This gives tool developers the flexibility to choose their domain abstractions freely. Also, changes made to one metamodel have no implications for other metamodels. From the tool developers' perspective, these are nice properties of the design depicted in Fig. 1.
However, our goal is to integrate the three tools. We do
want to render state machines, which is why states and transitions need to be shapes as well. We would also like to apply
the graph analysis tool to state machines. Therefore, states
need to represent nodes and transitions need to be handled
like edges. In other words, our tools need to share common
data.
Besides data that is shared across all tools (i.e., states
and transitions), there is also data that is required by a
subset of the tools only. For example, we might want to
show errors stemming from the analysis of a state machine
in red colour when rendering state machines. Thus, the tool
integration must allow the analysis tool to change the colour
of problematic elements.
The question raised by this example is how the three
tools should be built in order to allow their interoperability
while preserving a high degree of independence. Interoperability includes the interactions mentioned above as well as
other interactions, which were not anticipated at development time. The goal of this work is to illustrate how one
can incorporate the tool interoperability concern at tool development time and thereby ease unforeseen integrations in
the future.
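To see the starting point in code, the three abstractions amount to disjoint class hierarchies with no shared supertype; the rendering of Fig. 1 below is ours, in Python for brevity.

# Sketch: the three tool abstractions as independent classes.
from dataclasses import dataclass

@dataclass
class State:                    # textual state machine editor's abstraction
    name: str
    kind: str = "PLAIN"         # PLAIN, INITIAL or FINAL

@dataclass
class Shape:                    # 2D shape renderer's abstraction
    x: int
    y: int
    size: int
    label: str = ""

@dataclass
class Node:                     # graph analysis tool's abstraction
    invalid: bool = False

# Nothing relates the three classes, although one system entity (a state)
# should simultaneously be a State, a Shape and a Node.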
3. PROBLEM ANALYSIS
From the previously discussed example we derive the following requirements for tool implementation and integration:
R1 Appropriate Abstraction Efficient tool implementation requires an appropriate data abstraction. Therefore,
each tool should operate on a tool-specific data abstraction
(tool metamodel).
R2 Tool Independence Tools should be unaware of each
other and reusable in different constellations. Thus, individual tool metamodels should be loosely coupled.
R3 Shared Data Tools should be able to access and
manipulate shared data. This requires means for integrating
and adapting different data abstractions.
R4 Tool Interaction Tools should be able to interact
if needed. Interaction is required when functionality of one
tool relies on data provided by another tool. Again loose
coupling is preferred.
These requirements seem to contradict: Tool implementation should be independent (R2) and rely on suitable abstractions (R1), but on the other hand their interoperation
requires sharing of common data (R3) and tool interaction
(R4). As discussed for the example in the previous section,
metamodelling enables the implementation of adequate and
tool-specific metamodels. However, regarding tool integration current metamodelling approaches have several drawbacks. In the following we will analyse and discuss two forms
of tool integration: Proactive Tool Integration, where the
metamodels of the tools to integrate are directly connected
at tool development time, and Retroactive Tool Integration, where the metamodels are untouched and external transformations are implemented during tool integration to synchronise data between tools.
3.1
Proactive Tool Integration
A common approach to access shared data is to directly
integrate the metamodel of one tool in the metamodel of
another. Such approaches have also been studied before by
other communities to integrate XML schemas [13].
We call these approaches proactive tool integration as they
require the implementation or at least anticipation of integration during the development of the integrated tools.
Object-oriented metamodelling supports two ways for such
invasive integration of metamodels—delegation and inheritance. With delegation one can import and reference existing metaclasses and thereby decorate existing elements with
additional data. By using inheritance one can add new subclasses to imported ones and thereby access, refine and reuse
the data of the original classes.
Delegation and inheritance is also used in various patterns that prepare anticipated metamodel extension. The
pattern presented in [7] works with association classes to
link metamodels that shall be integrated. These are inherited from abstract association classes to provide a generic
protocol for navigating associations which enables flexible
addition of new links and, thus, flexibility for later metamodel extension. In [11] Emerson et. al differentiate further
patterns like metamodel interfacing or class refinement for
proactive metamodel integration.
As depicted in Fig. 2, both delegation and inheritance
tightly couple the integrating metamodel to the integrated
one. In the following we discuss the details of this integration
approach regarding the introduced requirements:
R1 Appropriate Abstraction The two techniques for
proactive tool integration require an adaptation of the integrating tool metamodel. Integration by delegation imposes
no structural restrictions on the integrating metamodel and,
thus, does not impair the abstraction used in the tool metamodel.
Integration by inheritance integrates the abstraction of
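To make the coupling concrete, the following minimal Java sketch (our illustration, not code from the paper) contrasts the two invasive integration styles using the metaclasses of the running example:

// Integrated tool metamodel (Graph Analysis Tool).
class Node {
    boolean invalid;
}

// Integration by inheritance: the integrating metaclass subclasses the
// integrated one and thereby inherits an abstraction it does not need.
class Shape extends Node {
    int x, y, size;
    String label;
}

// Integration by delegation: the integrating metaclass holds a reference
// into the integrated metamodel to decorate its elements with extra data.
class ShapeDecorator {
    Node decorated;   // hard dependency on the Graph Analysis Tool metamodel
    int x, y, size;
    String label;
}

In both variants the renderer's metamodel now names types from the graph tool's metamodel, which is exactly the coupling criticised under R2.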
3.2 Retroactive Tool Integration
A second approach for tool integration is to implement
metamodels independently and to apply transformations to
synchronise data from one representation to another. This
technique is non-invasive w.r.t. the tools' metamodels and is, thus, typically used to integrate existing tools retroactively.
As depicted in Fig. 2, retroactive tool integration does not
introduce a dependency between the integrated metamodels.
Regarding our requirements for tool integration it has the
following properties:
R1 Appropriate Abstraction As no tool metamodel is
affected by retroactive tool integration this requirement is
satisfied.
                          Proactive Integration           Retroactive Integration
Technique                 Inheritance, Delegation         Transformation
Appropriate Abstraction   - tool metamodel needs to       + tool metamodels unaffected
                            be adapted
Tool Independence         - strong coupling of tools      + tools are not coupled
Shared Data               + shared data among all         - data replication
                            integrated tools              - synchronisation necessary
Tool Interaction          +/- only anticipated            - transformations hinder
                            interaction supported           interaction among several tools

Figure 2: Analysis of existing Tool Integration Techniques

Figure 3: Example of Role-based Tool Metamodels (three role-based tool metamodels: the 2D Shape Renderer with role Shape (x, y, size: Integer; label: String; kind: Kind), the Graph Analysis Tool with roles Node (invalid: Bool; colour: Colour) and Edge (source, target) and enums Kind {CIRCLE, RECTANGLE, LINE} and Colour {WHITE, BLACK, RED}; notation for role types, attributes, references and enums)

R2 Tool Independence Model transformations provide flexible means to connect arbitrary tool metamodels, while the involved tools stay completely independent.
R3 Shared Data Model transformations copy and replicate data from one metamodel to another. If they are implemented in a bi-directional fashion, data sharing can be
emulated. This is a very flexible coupling mechanism. Tools
can easily be integrated by providing new transformations.
However, if several tools are involved, all transformations
need to be synchronised. Furthermore, transformations are
typically performed on whole models and not on individual
model elements, which discourages this integration mechanism for scenarios where tools should concurrently and interactively operate on shared data.
For example, if transformations are employed to realise the scenario from Sect. 2, one has to wait for the graph analysis to finish before the rendering tool can draw state machines including the red colouring for erroneous elements. If both tools shared data physically, the analysis could run in parallel, triggering a repaint of the incorrect elements once the analysis has finished.
R4 Tool Interaction When transformations are used for tool integration, the interrelation of shared data and tool-specific data is implicitly defined in the transformation. Due to this non-invasive data integration it is hard to track the relationships between shared data and the corresponding tool-specific data, which impedes multilateral tool interaction.
To comprehend this drawback, consider a tri-lateral interaction of state machines, graphs and 2D rendering. First a state machine is transformed to a graph representation. Once the resulting graph is analysed, the problematic node can be tagged invalid. But to render invalid states in a red colour, the rendering tool needs to know which state was transformed to the node that was tagged invalid. Tracking and using the relation between states and nodes in the rendering tool unnecessarily complicates the realisation of such tool interaction.
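To illustrate the replication and tracing problem, consider the following sketch of ours (assuming a simple batch transformation, not the paper's code):

import java.util.HashMap;
import java.util.Map;

// Minimal one-way transformation from the editor's model to the graph model.
class State { String name; }
class Node { boolean invalid; }

class StateMachine2Graph {
    // Copies every state into a fresh node; the returned trace map is the
    // only record of which state became which node.
    Map<State, Node> transform(Iterable<State> states) {
        Map<State, Node> trace = new HashMap<>();
        for (State s : states) {
            trace.put(s, new Node());   // data is replicated, not shared
        }
        return trace;   // unless persisted, the trace is lost after the run
    }
}

Unless such a trace map is stored and maintained by every participating tool, the rendering tool cannot determine which Node corresponds to which State.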
3.3 Conclusion of Problem Analysis
In Fig. 2 we summarised the most important characteristics of both integration approaches w.r.t. the requirements defined above. Given this characterisation one can conclude that no approach satisfies all requirements. While proactive tool integration is more appropriate for shared data and tool interaction, retroactive integration better addresses the need for tool independence and does not interfere with tool-specific abstractions. In the following we introduce an approach for role-based tool integration which combines the benefits of proactive and retroactive integration by providing means for both tool independence and tool interaction.
4. ROLE-BASED TOOL INTEGRATION
To overcome the limitations outlined above and to increase the flexibility of tool interoperability, we propose to use role models as an extension of current object-oriented metamodelling. The concept of roles was originally presented in [18] and applied to the design of frameworks in [19]. Conceptually, a role model captures an area of concern [18]. This motivates its application to achieve tool independence. On the other hand, role modelling introduces the technique of role composition [2] to integrate several role models and object-oriented system specifications into an interacting system implementation.
In the following we introduce role-based metamodelling and elaborate how role modelling and role composition help to achieve tool independence and interaction, respectively.
4.1 Role Models for Tool Independence
In [18] a role model is defined to capture an "archetypical pattern of objects" that enables the specification of a specific concern of system behaviour. Thus, role models can be used to define a specific abstraction to implement a particular tool. Each individual role defines a type of object required to achieve the tool's functionality.
Fig. 3 depicts a role-based version of the tool metamodels introduced in Sect. 2. Instead of concrete class-based data structures, these metamodels use role types for all concepts of a tool metamodel. To model data required to implement tool functionality, role types can define role attributes that provide primitive values or references that connect several role types. Role models can, thus, be considered a mechanism for defining the required interface of a specific tool. Compared to current metamodelling approaches, role models do not bind this interface to a concrete data source or representation. Thus, in contrast to using interfaces and classes, roles do not distinguish between data that is physically represented and data that is obtained from another object. This distinction is made by the grounding specification, which will be discussed below.
As depicted in Fig. 3 for the 2D Shape Renderer, we use role features to specify that every player of the Shape role needs to provide a label and the kind of shape it is represented by. In the metamodel of the Graph Analysis Tool, role references are used to provide the source and target Node of an Edge.
The role-model metamodelling language used to define the metamodels is depicted in Fig. 4. It provides means to define RoleModels consisting of several Roles. Each role contributes a set of RoleFeatures that are either Attributes with a PrimitiveType or References with a complex Type. In addition to Roles, Enums are allowed as complex Types.

Figure 4: Language for Role-based Metamodelling (metamodel sketch: a RoleModel contains Roles; each Role contributes RoleFeatures, which are Attributes typed by a PrimitiveType or References typed by a complex Type; Enums with Literals are further complex Types)

4.2 Role Composition for Tool Interaction
Tool interaction can now be achieved by specifying a RoleBinding relationship between Roles of several tool RoleModels (cf. Fig. 5 and 6). Each role binding connects a role of one role model to the role it plays in another role model. For each role binding, RoleFeatureBindings can be used by the role player to bind attributes and references of the role. This makes role composition suitable for the specification of data sharing.
Fig. 5 depicts the bindings used to integrate the tool metamodels defined in our example. For the 2D Shape Renderer, both Transition and State play the role Shape. From this example we see that roles can be bound by several role players. Likewise, a role player can be bound to several roles. To integrate the state machine metamodel with the metamodel of the Graph Analysis Tool, Transitions are bound to the Edge role and States are bound to the Node role. These bindings are completed by the role composition specification depicted in Listing 1.
The language used to specify these bindings can be either declarative or imperative (e.g., using Java). Depending on the implementation of the role bindings (see Sect. 4.3), the bindings need to be translated to the target platform of the integration or interpreted at runtime.
Up to now, the physical representation of shared data has not been specified. To do so, we introduced the concept of RoleGrounding (cf. Fig. 6) to role composition. Grounding is used to identify roles and role features that physically materialise data. In contrast to proactive integration, where newly developed tools need to adapt to an existing and fixed data materialisation, or retroactive integration, where data is replicated and materialised in various representations, this allows postponing the decision on data materialisation until tool integration time.

Figure 5: Example of Role Composition for Tool Interaction (role bindings and groundings between the Textual State Machine Editor (State: name: String, type: StateType {PLAIN, INITIAL, FINAL}; Transition: condition: String, from, to), the 2D Shape Renderer (role Shape) and the Graph Analysis Tool (roles Node and Edge); notation for grounded roles, grounded features and role bindings)

integrate statemachine, 2dShapes, graph {
  State plays Shape {
    label: name
    kind: if (player.type == PLAIN)
            return RECTANGLE
          else return CIRCLE
    colour: if (player.type == INITIAL)
              return WHITE
            else return BLACK
  }
  Transition plays Shape {
    label: condition
    kind: return LINE
    colour: return BLACK
  }
  State plays Node {}
  Transition plays Edge {
    source: from
    target: to
  }
  ground State { name, type }
  ground Transition { condition, from, to }
}

Listing 1: Role Composition Specification
Fig. 5 and Listing 1 depict the RoleGroundings and RoleFeatureGroundings used to specify data materialisation for our example. The example shows that grounding complements the flexibility of role binding as it allows for an integration-specific adjustment of data materialisation. Data that shall be used by both the Graph Analysis Tool and the Renderer is materialised and shared using the roles in the state machine role model. Data which is only required within a specific tool (e.g., the x-, y- and size-coordinates of a Shape) is materialised in the respective role model. Such flexible means to specify data sharing and materialisation relieves tool developers of the challenge of anticipating the extensibility required for future tool integration and avoids problems of data replication and synchronisation.

Figure 6: Composition Language for Role-based Metamodelling (metamodel sketch: a Composition contains RoleBindings with RoleFeatureBindings, which bind the Roles of one role model to the RolePlayers of another, and RoleGroundings with RoleFeatureGroundings, which mark the roles and role features that are physically materialised)
4.3 Implementing Role Composition
Up to now, the languages to specify both the data abstractions used by tools (i.e., the role models) and the possible
interactions across them (i.e., the role bindings) have been
presented. The former specification also includes the decision about which data needs to be represented physically (i.e., the grounding).
To actually use role-based metamodelling for tool interoperability, the question of how to implement both role bindings and groundings needs to be answered. There is no single answer, because roles can be implemented in various ways (see [20] for an overview); we present one feasible solution here. The aim is to show that the presented approach can easily be implemented based on classic object-oriented technology; we do not claim it to be a universal solution.
Our mapping of roles, role bindings and groundings is depicted in Fig. 7. The simplest mapping is the one for grounded roles and grounded features (i.e., attributes and references). These are basically mapped to plain classes and features, respectively. As grounded roles and features are selected to be the part of the role model that is physically represented, this mapping is straightforward.
To explain the mapping of role types, one must keep in mind that role types can only exist in collaboration with one or more role players, where each player is connected to the role with a role binding. We map this relation to object-oriented models as shown in Fig. 7. For each role type a dedicated interface (RoleTypeInterface) is introduced, which defines the features of the role. This interface is a supertype of both the implementation of the role type (RoleTypeImpl) and the player type. The latter inheritance relation reflects the fact that players need to provide all features that are expected by clients of the role. This pattern is quite similar to the Role Object Pattern [8]. We merely added an additional interface GenericRoleTypeInterface to provide reflection facilities, which can be used to check whether an object plays a certain role (hasRoleType()) or to obtain the roles played by an object by role type (getRoleByType()).
Figure 7: Patterns for Object-oriented Implementation of Role Binding and Grounding (sketch: each role type yields a RoleTypeInterface with getRoleFeature()/setRoleFeature(); the RoleTypeImpl delegates these calls to the player's getPlayerFeature()/setPlayerFeature(); grounded roles and features map to plain classes and features)

The pattern can be applied to a pair of types in both directions. Consider for instance a pair of types A and B: it is possible to have one role binding where A is the player and B is the role, and a second binding where B is the player and A is the role. Only the bound role features need to be distinct and free of cycles. Consequently, our approach does not distinguish between an integrated and an integrating metamodel, but supports different directions for individual bindings and, thus, flexible data propagation in both directions.
The role binding specification presented before (cf. Listing 1) is used to fill the implementation of the RoleTypeImpl class. If the role binding is a declarative one (e.g., label: name), appropriate code needs to be generated to delegate calls to the methods getLabel and setLabel to getName and setName, respectively. For bindings that use the imperative style (e.g., the binding for the colour feature), the code needs to be translated to the target implementation language.
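For illustration, the delegation code that a generator might derive from the binding label: name could look as follows in Java (a sketch of ours following the naming pattern of Fig. 7; the concrete generator output is not shown in the paper):

// RoleTypeInterface for the Shape role (only the label feature shown).
interface ShapeRole {
    String getLabel();
    void setLabel(String label);
}

// Role player from the state machine metamodel.
class State {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Generated RoleTypeImpl: binds State to Shape by delegating the role
// feature 'label' to the player feature 'name'.
class StateShapeImpl implements ShapeRole {
    private final State player;
    StateShapeImpl(State player) { this.player = player; }
    public String getLabel() { return player.getName(); }
    public void setLabel(String value) { player.setName(value); }
}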
Depending on the style of the binding (i.e., either declarative or imperative), role bindings may or may not be information preserving. Whether this is a requirement depends on the concrete domains that are bound to each other. Additionally, if the imperative style is used, one should make sure that the get and set operations are inverses of each other.
A complete mapping of all role bindings, feature bindings and groundings results in a plain object-oriented model.
Such a model implements the integration defined by the role
model and the role binding specification. Moreover, this
object-oriented model can be derived fully automatically. If
a different tool integration is needed, the role binding and
the grounding can be changed, leaving the involved tools
untouched.
To obtain a sound integration of all tools, all role types and features must be either bound or grounded. This follows from the fact that the data required by tools must either be derived, by evaluating role bindings, or be available in materialised form, in accordance with the groundings.
It is important to realise that even though the implementation sketched above is based on inheritance, it differs from plain inheritance-based approaches to interoperability. In our approach, the patterns that employ inheritance to achieve integration are derived automatically, whereas using inheritance "as is" does not force tool developers to apply the patterns correctly.
integrate statemachine, 2dShapes, graph {
  State plays Shape {
    colour:
      if (player.hasRoleType(Node) &&
          player.getRoleByType(Node).isInvalid())
        return RED
      else if (player.type == INITIAL)
        return WHITE
      else return BLACK
  }
}

Listing 2: Reflection on Role Bindings
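A minimal Java rendering of the reflection facilities used in Listing 2 could look as follows (our sketch; the paper only prescribes the two operations hasRoleType() and getRoleByType()):

import java.util.HashMap;
import java.util.Map;

// GenericRoleTypeInterface: lets clients test for and retrieve the roles
// an object currently plays, keyed by the role's interface type.
interface GenericRoleType {
    boolean hasRoleType(Class<?> roleType);
    <T> T getRoleByType(Class<T> roleType);
}

// One possible player-side implementation backed by a role registry.
class RolePlayerBase implements GenericRoleType {
    private final Map<Class<?>, Object> roles = new HashMap<>();

    protected void addRole(Class<?> roleType, Object roleImpl) {
        roles.put(roleType, roleImpl);
    }
    public boolean hasRoleType(Class<?> roleType) {
        return roles.containsKey(roleType);
    }
    public <T> T getRoleByType(Class<T> roleType) {
        return roleType.cast(roles.get(roleType));
    }
}

With such a registry, the colour binding of Listing 2 can ask whether its player also plays the Node role and, if so, read the invalid flag through that role.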
4.4 Contributions of Role-based Metamodelling
In the following we summarise the contributions of role-based metamodelling w.r.t. the requirements for tool implementation and integration defined in Sect. 3.
R1 Appropriate Abstraction Role models are meant to decompose systems into units of concern. Thus, our role-based metamodelling approach satisfies the requirement to provide data abstractions customised for tool-specific needs. As role composition contributes advanced means to integrate different role models, the need for tool interaction does not interfere with the design of tool-specific abstractions.
R2 Tool Independence Each role model can be specified independently of other role models. Consequently, tools can be implemented independently. Role composition is performed in a separate step and provides a technique to loosely couple tools. This resolves the tool interdependence issues experienced with proactive integration approaches. Thus, changes (renaming, extension, refactorings) that result from the evolution of any involved tool can be supported by adapting the composition specification and do not interfere with the metamodels used in other tools.
R3 Shared Data Role composition takes several role models and provides means to compose them into an integrated system specification. The composition process integrates role players and the roles they play in accordance with the role feature bindings. These bindings implement the desired data sharing between the role models of several tools. Role grounding provides means to precisely and flexibly specify the physical data representation at integration time. Data synchronisation is supported in any direction, and the duplication found in retroactive integration approaches is avoided.
R4 Tool Interaction Role composition can be used to share data and, thus, enable interaction of the corresponding tools. Furthermore, our role composition approach allows reflective access to the role bindings of a role player, which enables more advanced forms of tool interaction. Consider for instance the interaction of the 2D Shape Renderer and the Graph Analysis Tool introduced previously: every time the Node a State plays is marked invalid by the graph analysis tool, the corresponding Shape should be drawn in a RED colour. This interaction can be achieved by the role composition defined in Listing 2, which is based on the reflection facilities introduced by the pattern in Fig. 7. This leverages multilateral tool integration while still preserving the independence of each individual tool.
5. RELATED WORK
Tool integration and interoperability is a broad field and has received a lot of attention in the literature (see [26] for an exhaustive survey). With the increasing popularity of model-based technology, the problem has been tackled from a different angle. Armed with tools such as standardised metamodelling languages (e.g., EMOF), unified data formats (e.g., XML Metadata Interchange (XMI) [24]) and model transformation languages (e.g., Query/View/Transformation (QVT) [25]), substantial improvements have been made.
Transformation-based approaches have been employed to synchronise various kinds of data [10, 22]. In principle any model transformation language can be employed for this task. However, some transformation languages are more suitable than others. For example, a very prominent approach in the field (i.e., Triple Graph Grammars [17]) was implemented in the MOFLON tool suite [1] and used to integrate heterogeneous tools. Furthermore, high-level specification languages were proposed to ease the specification of relations between domain models [12].
All these transformational approaches, however, still suffer from the drawbacks mentioned before. Data is replicated and tools cannot operate concurrently on the same data. Neither drawback applies to role-model based tool integration, which is an advantage of our approach. Nonetheless, transformations are a good choice in the presence of existing metamodels. If the domain abstraction used by a tool has not been captured in a role model and cannot be changed anymore, our approach is not applicable.
Metamodel integration approaches import and reuse metamodels to establish interoperability rather than transforming data required by heterogeneous tools. For example, in [3] a common metamodel is proposed to establish tool interoperability. While this is a feasible approach if one is focused on a specific, closed domain, it strongly couples the involved tools to the common metamodel.
The classical mechanisms for reuse (i.e., delegation and inheritance) have been subject to an analysis in this paper. Based on these two mechanisms, a pattern that allows creating extensible metamodels was presented in [7]. This approach focuses on the level of metamodels (M2). It is similar to our approach, but we use roles as a first-class concept during modelling instead of emulating them with a modelling pattern. Using a dedicated concept for integration makes interfaces explicit and avoids errors compared to applying a pattern.
Besides the classical mechanisms for extension and reuse, several others were proposed in the past [11]. Not all of these mechanisms have been investigated with regard to tool interoperability. To the best of our knowledge, this paper is the first to investigate how to use roles as a metamodel integration concept. Evaluating other mechanisms is subject to future work.
A general problem related to the integration of metamodels is their semantics, or their relations in general. To find correct mappings between metamodels, the use of ontologies has been proposed [16]. Also, to capture relations across metamodels, megamodels can be used [5]. While ontologies may certainly help in identifying common or related concepts, we leave this task to the tool integrator. The derivation of metamodel mappings based on ontologies is independent of its realisation using a particular integration technique. Depending on the chosen technique (e.g., transformations or metamodel integration), the resulting integrations still suffer from the drawbacks identified in Sect. 3.
Tool integration may also span multiple technological spaces, which requires aligning the concepts of the spaces themselves (M3) before aligning the concepts of the tools that reside in one particular space. An example of such an integration can be found in [6]. Here, tools participating in an interoperability scenario need to provide explicit metamodels, which are then aligned to each other. In contrast, we assume tools to reside in the same technological space, but focus on the flexible adaptation of tools with different domain abstractions.
The integration of heterogeneous tools and the adaptation to different repositories has also been tackled by the ModelBus project [14]. However, that project aimed at providing infrastructure to integrate tools based on existing technology (e.g., model transformation languages). This can ease the task of tool integration to some degree, but the problem of integrating metamodels and creating transformation specifications is not resolved by this infrastructure.
Overall, the separation of tools (i.e., their functionality) and the data they process (i.e., their repository) is a crucial point within the topic of tool integration. This has been coined the service vision in [4]. The work presented in this paper is along this line and may therefore be conceived as one approach to model tools as services.
6. CONCLUSION
This paper was motivated by the need to integrate tools in order to reuse their functionality. To allow such reuse, tools must exchange data that is relevant to perform certain tasks. While standardised metamodelling languages have made this substantially easier, there are still plenty of open problems. To identify some of them, we have carefully analysed existing techniques for sharing and synchronising data based on models (i.e., object-oriented metamodel integration and model transformation).
We have identified drawbacks of both approaches. While proactive tool integration does not allow tools to be independent of each other and interferes with tool-specific abstractions, retroactive tool integration using transformations does not support data sharing and impedes multilateral tool interaction. In the latter case, interoperability was not anticipated at all, while in the former it is anticipated only to a certain degree. This limited anticipation of tool interoperability at tool design time is what makes the lives of tool integrators so hard.
To explicitly anticipate arbitrary future interactions between tools, we propose to use role models to specify the view on data required by tools. These views are bound to other views or physical data representations at tool integration time. Thus, decisions about interoperability, which were previously made by tool developers, are left to the tool integrators. This increases the degree of freedom at integration time and therefore eases establishing tool interoperability.
The use of role models to specify tool interfaces decouples the specification of the data required by a tool from the binding of this interface to a concrete data source or representation. The latter is established by role bindings, which are created by the tool integrator, as she is the one who knows which data needs to be shared by which tools. Still, designing metamodels using roles is similar to using classic object-oriented metamodelling facilities.
In the future, further investigations are needed to prove the feasibility of the approach in industrial scenarios. An implementation based on the Eclipse Modeling Framework (EMF) [21] is planned. This will allow us to use the existing EMF infrastructure and to test the integration of various tools that are based on EMF. Based on such an implementation, role compositions could be validated to improve the reliability of composition specifications. Also, the full potential of model-based specifications of tool integrations can then be exploited. While we do not foresee conceptual problems in this regard, the implementation and validation on a representative case study still need to be performed. In this paper we sketched the idea and illustrated its benefits on a small toy example only.
Another important issue that needs to be addressed is the migration of existing data. Once the landscape of integrated tools changes (e.g., new tools are included), the grounding of data may change and existing data needs to be transferred to the new physical representation. To achieve such a migration, an analysis of the changes made to the role bindings and groundings is needed.
The main limitation of the role-based metamodelling approach is that it needs to be applied at tool design time to prepare tools for interoperability. It cannot be applied as-is to integrate existing tools. Nonetheless, we strongly believe that the benefits gained at tool integration time outweigh the cost of requiring tool developers to design their tools for interaction. The fact that the proposed modelling approach anticipates all potential future integration scenarios by construction frees tool developers from the burden of anticipating the unanticipated.
7. REFERENCES
[1] C. Amelunxen, F. Klar, A. Königs, T. Rötschke, and
A. Schürr. Metamodel-based Tool Integration with
MOFLON. In W. Schäfer, M. B. Dwyer, and
V. Gruhn, editors, ICSE, pages 807–810. ACM, 2008.
[2] E. P. Andersen. Conceptual Modeling of Objects: A
Role Modeling Approach. Ph.D. Thesis. Oslo, Norway,
University of Oslo, 1997.
[3] A. Baumgart. A common meta-model for the
interoperation of tools with heterogeneous data
models. In Hein and Wagner [15].
[4] J. Bézivin, H. Brunelière, J. Cabot, G. Doux,
F. Jouault, and J.-S. Sottet. Model Driven Tool
Interoperability in Practice. In Hein and Wagner [15].
[5] J. Bézivin, F. Jouault, and P. Valduriez. On the Need
for Megamodels. In Proceedings of the
OOPSLA/GPCE: Best Practices for Model-Driven Software Development workshop, 19th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2004.
[6] H. Bruneliere, J. Cabot, C. Clasen, F. Jouault, and J. Bézivin. Towards Model Driven Tool Interoperability: Bridging Eclipse and Microsoft Modeling Tools. In T. Kühne, B. Selic, M.-P. Gervais, and F. Terrier, editors, ECMFA, volume 6138 of Lecture Notes in Computer Science, pages 32–47. Springer, 2010.
[7] S. Burmester, H. Giese, J. Niere, M. Tichy, J. P. Wadsack, R. Wagner, L. Wendehals, and A. Zündorf. Tool Integration at the Meta-Model Level within the FUJABA Tool Suite. International Journal on Software Tools for Technology Transfer (STTT), 6(3):203–218, August 2004.
[8] D. Bäumer, D. Riehle, W. Siberski, and M. Wulf. The Role Object Pattern. In Proceedings of the 4th Pattern Languages of Programming Conference (PLoP'97), Washington University Dept. of Computer Science, Tech. Report (wucs-97-34), 1997.
[9] H. Ehrig, K. Ehrig, A. Habel, and K.-H. Pennemann. Constraints and Application Conditions: From Graphs to High-Level Structures. In H. Ehrig, G. Engels, F. Parisi-Presicce, and G. Rozenberg, editors, ICGT, volume 3256 of Lecture Notes in Computer Science, pages 287–303. Springer, 2004.
[10] K. Ehrig, G. Taentzer, and D. Varró. Tool Integration by Model Transformations based on the Eclipse Modeling Framework. EASST Newsletter, June 2006.
[11] M. Emerson and J. Sztipanovits. Techniques for Metamodel Composition. In OOPSLA – 6th Workshop on Domain Specific Modeling, pages 123–139, October 2006.
[12] M. D. D. Fabro, J. Bézivin, and P. Valduriez. Model-Driven Tool Interoperability: An Application in Bug Tracking. In R. Meersman and Z. Tari, editors, OTM Conferences (1), volume 4275 of Lecture Notes in Computer Science, pages 863–881. Springer, 2006.
[13] W. Fan and P. Bohannon. Information Preserving XML Schema Embedding. ACM Transactions on Database Systems, 33(1), 2008.
[14] C. Hein, T. Ritter, and M. Wagner. Model-Driven Tool Integration with ModelBus. In Workshop Future Trends of Model-Driven Development, 2009.
[15] C. Hein and M. Wagner, editors. 3rd Workshop on Model-Driven Tool and Process Integration, Co-located with ECMFA 2010, 16th June 2010, Paris, France, 2010.
[16] E. Kapsammer, H. Kargl, G. Kramler, T. Reiter, W. Retschitzegger, and M. Wimmer. On Models and Ontologies - A Layered Approach for Model-based Tool Integration. In Proceedings of the Modellierung 2006 (MOD2006), pages 11–27, 2006.
[17] A. Königs and A. Schürr. Tool Integration with Triple Graph Grammars - A Survey. Electronic Notes in Theoretical Computer Science, 148(1):113–150, 2006.
[18] T. Reenskaug, P. Wold, and O. Lehne. Working with Objects: The OOram Software Engineering Method. Manning Publications, Greenwich, CT, 1996.
[19] D. Riehle and T. R. Gross. Role Model Based Framework Design and Integration. In Proceedings of the 13th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA '98), pages 117–133, 1998.
[20] F. Steimann. On the Representation of Roles in Object-oriented and Conceptual Modelling. Data & Knowledge Engineering, 35(1):83–106, 2000.
[21] D. Steinberg, F. Budinsky, M. Paternostro, and E. Merks. Eclipse Modeling Framework (2nd Edition). Pearson Education, 2009.
[22] Y. Sun, Z. Demirezen, F. Jouault, R. Tairas, and J. Gray. A Model Engineering Approach to Tool Interoperability. In D. Gasevic, R. Lämmel, and E. V. Wyk, editors, Software Language Engineering, First International Conference, SLE 2008, Toulouse, France, September 29-30, 2008. Revised Selected Papers, volume 5452 of Lecture Notes in Computer Science, pages 178–187. Springer, 2008.
[23] The Object Management Group. Meta Object Facility (MOF) Core Specification, version 2.0. Technical report, January 2006.
[24] The Object Management Group. Meta Object Facility (MOF) 2.0 XMI Mapping Specification, v2.1.1. Technical report, December 2007.
[25] The Object Management Group. Meta Object Facility (MOF) 2.0 Query/View/Transformation Specification. Technical report, April 2008.
[26] M. N. Wicks and R. G. Dewar. Controversy Corner: A new research agenda for tool integration. Journal of Systems and Software, 80(9):1569–1585, 2007.
[27] J. Winkelmann, G. Taentzer, K. Ehrig, and J. M. Küster. Translation of Restricted OCL Constraints into Graph Constraints for Generating Meta Model Instances by Graph Grammars. Electronic Notes in Theoretical Computer Science, 211:159–170, April 2008.
Aligning Business and IT Models in Service-Oriented
Architectures using BPMN and SoaML
Brian Elvesæter
Dima Panfilenko
Sven Jacobi & Christian Hahn
SINTEF ICT
P. O. Box 124 Blindern
N-0314 Oslo, Norway
+47 22 06 76 74
DFKI IWi
Stuhlsatzenhausweg 3, Campus D3.2
D-66123 Saarbruecken, Germany
+49 681 85775 7777
Saarstahl
Bismarckstraße 57-59
D-66333 Voelklingen, Germany
+49 6898 10 3476
brian.elvesater@sintef.no
dima.panfilenko@dfki.de
{sven.jacobi | christian.hahn}
@saarstahl.com
ABSTRACT
In this paper, we introduce the new Service oriented architecture Modeling Language (SoaML) and describe how the language can be used to align business models and IT models. In particular we provide a mapping specification from BPMN models to SoaML models.

Categories and Subject Descriptors
D.2.11 [Software Architectures]: Service-oriented architectures. D.2.12 [Software Engineering]: Interoperability.

General Terms
Design, Standardization, Languages, Theory.

Keywords
Business modelling, service modelling, business and IT alignment, BPMN, SoaML.

1. INTRODUCTION
There is an industrial interest in ensuring a good connection and mapping between business models as expressed in enterprise architectures and IT models as expressed in technical system architectures, which are commonly realised as service oriented architectures (SOAs). The increasing popularity of the SOA paradigm relies on its closeness to business models, in particular business processes. The concepts of SOA apply both to business architectures and to system architectures. From a business perspective the SOA describes the business-critical processes, contracts, information and capabilities of the enterprise. From an IT perspective the SOA describes the software components, their service interfaces and how these components can be coupled to form a technical system architecture that supports the business requirements of the enterprise.
Although SOA concepts, business models and service technologies have been hot topics over the last few years, the alignment of business and IT models still remains a challenge. Furthermore, although modelling is now an integrated part of software engineering approaches, standardised modelling languages to support SOA have been lacking. SHAPE (Semantically-enabled Heterogeneous Service Architecture and Platforms Engineering) (ICT-2007-216408) (http://www.shape-project.eu/) is a European research project under the 7th Framework Programme that has developed an infrastructure for model-driven engineering (MDE) for SOA with support for various technology platforms [1]. The SHAPE technologies revolve around the new Service oriented architecture Modeling Language (SoaML) specification [2] from the Object Management Group (OMG). SoaML aims at providing a common modelling language to business and system architects. In the SHAPE project we have defined an MDE approach to SOA that incorporates the use of business modelling formalisms such as BPMN and provides mappings to SoaML to help the business and system stakeholders align their business requirements and IT system implementations.
This paper is structured as follows: In Section 2 we give an overview of the SoaML language. Section 3 describes our requirements, mapping rules and tool support for the business and IT alignment between BPMN and SoaML. In Section 4 we present an illustrative example taken from one of the industrial use cases in the SHAPE project. Section 5 discusses our results and findings. Finally, Section 6 concludes this paper.
2. SoaML
The Service oriented architecture Modeling Language (SoaML)
specification [3] defines a UML profile and a metamodel for the
design of services within a service-oriented architecture. The
goals of SoaML are to support the activities of service modelling
and design and to fit into an overall model-driven development
approach. The SoaML profile defines extensions to UML to
support the range of modelling requirements for service-oriented
architectures, including the specification of systems of services,
the specification of individual service interfaces, and the
specification of service implementations. This is done in such a
way as to support the automatic generation of derived artefacts
following an MDA based approach.
According to the specification, SoaML has been designed to support both an IT and a business perspective on SOA. Our experiences with the SoaML language, in the context of tool and method implementation in the industrial use case, have suggested that a clearer separation of the business-level and IT-level concepts is needed. In the context of SHAPE we have made these levels more explicit. Figure 1 illustrates the separation.
Figure 1. Business and IT concepts of SoaML (the business perspective covers business goals, business processes and participants, services architectures, capabilities and service contracts; the IT perspective covers service interfaces, interfaces and messages, service choreographies, and components and ports)

In the business perspective on SOA we suggest to integrate the use of the SoaML language with the Business Motivation Model (BMM) language [4] to define business motivation models and the Business Process Model and Notation (BPMN) language [5] to define business processes. Motivation models and business processes are important aspects to be included when modelling the business perspective on SOA. The SoaML specification defines relationships to BMM, and the BMM specification defines relationships to BPMN, which allows for this integration of languages. In this paper we focus on the relation between BPMN and SoaML in order to align process models from BPMN with service models for SOA. The language constructs from SoaML that are most suitable at the business level are participant, services architecture, service contract and capability (see Figure 2).

Figure 2. UML extensions for business concepts

Services architectures are used to define how a set of participants works together for some purpose by providing and using services. A services architecture describes how participants work together by providing and using services expressed as service contracts.
Service contracts are used to describe interaction patterns between service entities. A service contract is used to model an agreement between two or more parties. Each service role in a service contract has an interface that usually represents a provider or a consumer.
Participants are used to define the service providers and consumers in a system. A participant may play the role of service provider, consumer or both.
Capabilities represent an abstraction of the ability to effect change. Capabilities identify or specify a cohesive set of functions or resources that a service provided by one or more participants might offer. Capabilities can be used by themselves or in conjunction with participants to represent general functionality or abilities that a participant must have.
The language constructs from SoaML that are most suitable at the IT level are service interface and its behaviour (i.e. service choreography), interface, message type, components (i.e. participants) and service and request ports (see Figure 3).

Figure 3. UML extensions for IT concepts

Service interfaces are used to describe the operations provided and required to complete the functionality of a service. A service interface can be used as the protocol for a service port or a request port.
Service data are used to describe service messages and message attachments. The message type is used to specify the information exchanged between service consumers and providers. An attachment is a part of a message that is attached to rather than contained in the message.
SoaML is agnostic to the choice of modelling formalisms to define behaviour. The specification states that any UML behavioural constructs can be used to describe behaviour, such as service choreographies, but also other formalisms such as BPMN can be used.
It should be noted that some of the language constructs defined in SoaML fit on both the business and the IT level. In particular this applies to participants, which are used to define the service providers and consumers in a system. At the business level the participants typically represent business organization units or roles, whereas on the IT level the participants typically represent IT systems or software components. When a participant acts as a provider it contains service ports, and when a participant acts as a consumer it contains request ports.

3. ALIGNING THE BUSINESS AND IT PERSPECTIVES ON SOA
3.1 Requirements
For the support of the different roles in a collaborative modelling project, one can think of appropriate modelling formalisms. In the alignment of the business and IT perspectives there are obviously at least two roles to consider: business architect and system architect. Both are experts in their area but do not necessarily use the same notations for representing the same concepts. For that reason, two formalisms that can be used by these respective users for modelling are described in the following.
Business users may use a business process modelling formalism such as Event-driven Process Chains (EPC) [6] to represent their workflows. Process chains describe the sequencing and interaction between data, process steps, IT systems, organisational structure and products. An EPC always starts and ends with events, which define the state or condition under which a process starts and the state under which it ends. An event may initiate multiple functions at the same time; similarly, a function may result in multiple events. To represent these branches and processing loops in an EPC, a connector (or rule) is used. However, instead of acting simply as graphical connections, the connectors also define the logical links between objects, such as "and" or "either/or". EPCs are typically used at the higher levels of the process hierarchy. If more technical details of business processes need to be described, other methods, such as BPMN, UML or BPEL, are used instead of EPCs. The reference models provided by SAP are also defined using the EPC methodology. EPCs offer a variety of ways to analyse processes and identify both quantitative and qualitative improvement options.
The Business Process Management Initiative (BPMI) (http://www.bpmi.org/) developed an initial standard called Business Process Modelling Notation (BPMN) that was adopted by the OMG and renamed Business Process Model and Notation (BPMN) [5]. The primary goal of BPMN is to provide a notation that is readily understandable by all business users, from the business analysts that create the initial drafts of the processes, to the technical developers responsible for implementing the technology that will perform those processes, and finally, to the business people who will manage and monitor those processes. Thus, BPMN creates a standardised bridge for the gap between business process design and process implementation. Another goal, but no less important, is to ensure that XML languages designed for the execution of business processes, such as BPEL4WS (Business Process Execution Language for Web Services), can be visualised with a business-oriented notation. Furthermore, BPMN offers the possibility to create organisational units: with pools and lanes one can manage the organisational view of the process and model the communication between pools and lanes.
In general, the BPMN and SoaML models can be seen as different architectural viewpoints on the enterprise model, coupled to the enterprise and information viewpoints and the computational viewpoint, respectively, of the Reference Model for Open Distributed Processing (RM-ODP) [7-10]. Indeed, BPMN focuses on the enterprise processes and information, whereas SoaML primarily describes the structure of the service architecture. The models we create with the BPMN and SoaML standards can be seen as architectural viewpoints according to IEEE 1471 [11], which suggests a viewpoint-based modelling approach for supporting different stakeholders in the system development process.

3.2 BPMN to SoaML Mapping Rules
In this section the mapping rules for the model transformation between BPMN and SoaML are presented. The challenge here is to transform BPMN models to SoaML in order to generate the appropriate system-relevant constructs for SoaML according to the generic business context on the computation independent model (CIM) level. The tool support for this is implemented within CIMFlexMT (see Section 3.3), which supports in its initial version the model-to-model transformation by making use of the Atlas Transformation Language (ATL) [12]. First the simple one-to-one rules are presented and then patterns for recognizing the SoaML service contracts are introduced.
Mapping Rule 0: Process to Services Architecture
A services architecture has components at two levels of
granularity. The community services architecture is a "top level"
view of how independent participants work together for some
purpose. The services architecture of a community does not
assume or require any one controlling entity or process. A
participant may also have a participant services architecture,
which specifies how parts of that participant (e.g., departments
within an organization) work together to provide the services of
the owning participant. Participants that realize this specification
must adhere to the architecture it specifies.
Mapping Rule 3: Pool to Participant (Community-level)
A pool in BPMN stands for a business entity or a participant of a
process, on the one hand. It also can be structured with respect to
further participants of the process, thus creating a participants’
hierarchy. These two points together to map the pool onto a role
in a community-level services architecture that has a participant
type matching the pool. Table 3 illustrates the mapping of the
notation.
The services architecture is aligned with the business process, and
the participants and service contracts can be derived from the
pools or lanes and activities in the business processes respectively
following these guidelines:
•
•
Identify public and collaborative business processes that
involve interactions and potential usage of software services
between different business organizations. These processes
are candidates for public community-level services
architectures in SoaML that describe the service contracts
between the business organizations.
Table 3. Pool to Participant (Community-level)
BPMN
Construct
Identify private business processes for the business entities
under your ownership control that are involved in the
services architecture under consideration. These processes
are candidates for private participant-level services
architectures in SoaML that describe the service contracts
between the internal organizational roles or units within the
business organization.
< < Part icipa nt > >
Po ol
< <Serv icesArc hit ec t ur e> >
XY
Notation
Ro le: Poo l
Mapping Rule 1: Task to UML Action
Mapping Rule 4: Lane to Participant (Participant-level)
A task describes an activity that is possibly providing a useful
output that could be consumed by the participants of the process.
It can be then mostly closely assigned to an action construct in
UML as it gives the abstract interface for the job done and at the
same time does not give further specification of the workflow
implementing this task. In the CIM manufacturing example it
means all three Tasks “Prepare Order”, “Purchase” and “Receive
Order” are mapped to actions. Table 1 illustrates the mapping of
the notation.
A lane represents a participant or a department in BPMN and is
situated in a pool, thus showing the two-tier hierarchy. In order to
show the possibility for further subdivision (which is also ongoing
in the current BPMN2 proposals), the lane is mapped to a role in a
participant-level services architecture that has a participant type
matching the lane. The participant-level services architecture
must adhere to the community-level services architecture for
which the corresponding pool participants (see rule 2) belongs.
Table 4 illustrates the mapping of the notation.
Table 1. Task to UML Action
Construct
Pool
SoaML
Participant
Role in a Community-level
Services Architecture
Table 4. Lane to Participant
BPMN
SoaML
Task
Action
Construct
Notation
BPMN
SoaML
Lane
Participant
< < Part icipa nt > >
< < Part icipa nt > >
Lane 1
Lane 2
< <Serv icesArc hit ec t ur e> >
Poo l
Notation
Mapping Rule 2: Sub-Process to Services Architecture
Role1: Lane1
A sub-process represents a more complex process than a simple
task, but still can be seen as a whole. It can be assigned to a
lower-level, e.g. participant-level services architecture that
details the roles and tasks of the sub-process. It should be
mentioned, though, that this services architecture is not
necessarily the bottom level and can be subdivided further
(through roles). Table 2 illustrates the mapping of the notation.
Mapping Rule 5: Message “Begin” to Service
The beginning point of each and every message in BPMN has the
following semantics – it should be the starting end of the data
channel between two participants or pools. This exact meaning
also has the service port in SoaML, which finds its accordance in
this mapping point. The participants in SoaML are using this
construct in order to provide services for other participants in the
modelled architecture. Table 5 illustrates the mapping of the
notation.
Table 2. Sub-Process to Services Architecture
Construct
BPMN
SoaML
Sub-process
Services Architecture
Role2: Lane2
< <Serv icesArc hit ec t ur e> >
XY
Notation
64
Table 7. Process fragment (pattern) to Service Contract
Table 5. Message “Begin” to Service
Construct
BPMN
SoaML
Message “Begin”
Service
Construct
BPMN
Lane1
Lane2
SoaML
Service
Contract
< <int er face> >
Lane1_T ask1_I nt er fac e
Notation
< <int er face> >
Lane2_T ask2_I nt er fac e
Notation
< <Serv iceCont ract > >
Lane1_Lane2
Lane1_Role: Lane1_T ask1_Int erface
Mapping Rule 6: Message “End” to Request
Lane2_Role: Lane2_T ask2_Int erface
The ending point of each and every message in BPMN has the
semantics that looks very alike with the message beginning point,
but is situated on the other end of the communication channel.
The similar semantics of the request port in SoaML offers this
construct to be mapped to the messaging end from the BPMN.
The aim of this mapping is the reflexion of the data channel target
in the service consumption of the modelled architecture. Table 6
illustrates the mapping of the notation.
Table 6. Message "End" to Request
Construct: BPMN Message "End" -> SoaML Request
Notation: (graphics omitted)

Mapping Rule 7: Process fragment (pattern) to Service Contract
There is no single construct in BPMN that resembles a service contract. The BPMN processes have to be analysed to identify process fragments that can be mapped to service contracts. A service contract defines a service specification that determines the roles each participant plays in the service and the interfaces they implement to play that role. We can, however, define a pattern of BPMN constructs that can be mapped to a service contract.

The pattern (see Figure 4) describes a task sequence connected by a sequence flow, where the participants are represented by different lanes in the same pool. The two tasks that belong to a service contract also share a data object. Table 7 illustrates the mapping of the notation.

Figure 4. Rule 7 Transformation Pattern

3.3 Tool Support
In the previous section we provided mapping rules for high-level CIM service modelling based on BPMN. This notation has been well known and established since the beginning of the 21st century; moreover, it has been standardized, and more than 50 commercial and open-source products implement the standard [13]. The particular challenge of service modelling by business users is that, on the one hand, CIM-level users have little awareness of services and, on the other hand, even where such knowledge exists, BPMN offers no direct constructs for describing services on the CIM level. The upcoming BPMN 2.0 standard [14] does include service modelling and the corresponding constructs, but it only solves the second, more technical problem, not the first one - understanding.

To solve this problem we propose a semi-automated approach based on a model-to-model (M2M) transformation from CIM-level BPMN models to PIM-level SoaML-based models. The BPMN models on the higher abstraction level are analysed through a set of mapping rules, resulting in a service model that contains the constructs and architectures needed for a comprehensive PIM-level model, which in turn serves as the basis for the further transformation to the PSM level. The remainder of this section comprises the manufacturing example and the mapping rules identified for mapping services from CIM- to PIM-level models. In addition, technical details of the BPMN-to-SoaML transformation are presented, giving a short insight into the serialisation of the models during the transformation.

As an example of the technical solution we consider mapping rule 7 for the service contract (see also Figure 4). In the following we show how the pattern identified for the recognition of the service contract on the CIM level is technically transformed into the corresponding PIM-level construct. We consider the specific function names in the ATL transformation file out of scope and concentrate on the XML representation of the source and target models. Rule 7 translates eight objects of the BPMN model into six objects of the SoaML model (see Table 8). The graphical representation of the SoaML models is taken from the SoaML editor developed in the SHAPE project.
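The SHAPE implementation of this rule is an ATL transformation operating on the XML serialisations of the models; purely as an illustration of the shape of rule 7, the following minimal sketch uses QVT-Relations syntax with assumed metamodel element names (Lane, Task, DataObject, ServiceContract, Role and their properties are not the SHAPE metamodels):

-- Illustrative sketch only: the pattern of rule 7 (two lanes whose tasks
-- share a data object) is mapped to a SoaML service contract with one
-- role per lane. All element and property names are assumptions; the
-- authors' implementation is an ATL transformation, not this fragment.
transformation BPMN2SoaML_Rule7 (bpmn : BPMN, soaml : SoaML) {
  top relation PatternToServiceContract {
    n1, n2 : String;
    checkonly domain bpmn lane1 : Lane {
      name = n1,
      task = t1 : Task { sharedData = d : DataObject {} }
    };
    checkonly domain bpmn lane2 : Lane {
      name = n2,
      task = t2 : Task { sharedData = d }
    };
    enforce domain soaml sc : ServiceContract {
      name = n1 + '_' + n2,
      role = r1 : Role { name = n1 + '_Role' },
      role = r2 : Role { name = n2 + '_Role' }
    };
  }
}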
Table 8. Transformation XML mapping
BPMN object -> SoaML object(s)
Lane1 -> Property Lane 1, Dependency1
Lane2 -> Property Lane 2, Dependency2
Association1 -> -
Association2 -> -
SequenceFlow -> -
Task1 -> -
Task2 -> -
DataObject -> <<ServiceContract>>, <<Collaboration>>, <<CollaborationUse>>

For the technical realisation of the transformations, the following agreements hold in the ATL transformation implementation:

- A CollaborationUse is an element of a ServicesArchitecture. The Properties are elements of a ServicesArchitecture as well. Dependencies are assigned to a CollaborationUse as children. (The hierarchy can be seen graphically in the SHAPE SoaML Editor.)
- The directions of the Associations are of no importance; they only have to connect the two Tasks in the different Lanes with a DataObject.
- The objects possess hierarchical structure relations; in particular, a CollaborationUse contains a reference to the according ServiceContract, Properties a reference to the according Participant, and Dependencies a reference to the according Properties.

The transformed ServiceContract element according to rule 7 can be seen in the SHAPE SoaML Editor, which shows not only the structure of the transformed element and its accompanying relations and properties, but also the SoaML stereotypes applied during the transformation (see Figure 5).

Figure 5. Hierarchy of objects in SHAPE SoaML editor

4. ILLUSTRATIVE EXAMPLE
The CIMFlex editor is a tool developed in the SHAPE project. It allows the user to create and refine a semi-formal model of a business process, an organisational structure, a data structure or business rules based on the input coming from the domain users. The editor is able to create, change and store these types of models in EPC or BPMN notation; XML files are generated as the storage format. The target users of this component are the domain users and especially business analysts. From an architectural point of view the component has two interdependencies with other components for its output. The information required for the creation of a CIM model is derived from the use cases by the domain users. The output of the CIM-level editor can take two different forms depending on its purpose. On the one hand, a CIM-level model in BPMN notation can be used as a technical information description draft, giving a starting point for the transformation into BPEL for further execution of the resulting model, or for enrichment with further technical information. On the other hand, the output of the CIM-level editor is the starting point for the CIM-to-PIM transformation. In this case the editor does not provide the models in BPMN notation, but transforms them into SoaML models. The conceptual and technical details of this transformation are described in Section 3. The prototypes of this transformation are available on the SHAPE website (http://www.shape-project.eu/).

Figure 6 depicts the Saarstahl manufacturing example modelled in the CIMFlex editor, which partly implements the BPMN notation. There is a pool named Manufacturing representing the cooperation between the two counterparts of the process, namely Customer and Manufacturer, represented by BPMN lanes. The start event is followed by a BPMN task on the Customer side that prepares the order. As soon as the Order, represented by a BPMN data object, reaches the Manufacturer, the Manufacturer performs a purchase operation and leads on to the receiving of the order by the Customer. The process ends with a BPMN end event. In the following we apply a set of mapping rules to illustrate the transformation from this BPMN model to SoaML.

Figure 6. Manufacturing process - input model

After applying the transformation, the following model emerges through the rules described before:
Figure 7. SoaML services architecture - output model: <<ServicesArchitecture>> ManufacturingArchitecture contains <<Participant>> Customer and <<Participant>> Manufacturer as the parts customerPart: Customer and manufacturerPart: Manufacturer, connected through ordering: ManufacturingContract via customerRole and manufacturerRole; the <<ServiceContract>> ManufacturingContract binds customerRole: CustomerInterface and manufacturerRole: ManufacturerInterface, realised by <<interface>> CustomerInterface and <<interface>> ManufacturerInterface.

As we can see, the lane constructs from the BPMN example are translated into participant constructs in SoaML (rule 4). At the same time, the pattern identified by rule 7 translates the interaction between Customer and Manufacturer into a service contract within the services architecture.
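Rule 4, applied above to the lanes, is a simple one-to-one mapping. As with the earlier sketch, the following QVT-Relations fragment is only an illustration under assumed metamodel names; the SHAPE prototype implements it in ATL:

-- Illustrative sketch only: rule 4 turns each BPMN lane into a SoaML
-- participant of the same name (metamodel names are assumptions).
transformation BPMN2SoaML_Rule4 (bpmn : BPMN, soaml : SoaML) {
  top relation LaneToParticipant {
    n : String;
    checkonly domain bpmn l : Lane { name = n };
    enforce domain soaml p : Participant { name = n };
  }
}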
5. DISCUSSION
There is an industrial interest in ensuring a good connection and mapping between business models, as expressed in enterprise architectures, and IT models, as expressed in technical system architectures, which are commonly realised as service-oriented architectures (SOAs). The gap between these models is not trivial to close, and we believe this stems from the fact that closing it is not only a technical task, but also one that requires collaboration and decisions by both business and system stakeholders. Obviously, modelling guidelines, mapping rules and software tools to model and execute semi-automated model transformations, such as those developed in SHAPE, can be used in the alignment of business and IT models, in particular for simple one-to-one mappings.

However, for more complex mappings, as evident in the mapping to service contracts, it is more of a business and IT design choice. Although we have presented a pattern for identifying service contracts by analysing BPMN processes, the choice of which tasks to include in a service contract is still not clear. This relates to the service choreography that defines the behaviour of the service contract. The issue is to include all tasks and all interactions that make up a suitable choreography. This choreography may include several interactions and the passing of messages across two or more pools in the case of multi-tier service contracts. This is a business design choice which ultimately depends on the people involved and how they best understand the business operations.

The overall approach presented by SHAPE is to model processes starting on the CIM, over the PIM, down to the PSM, yielding a system that reflects the processes described on the CIM level. For green-field projects this 'top-down approach' might be suitable. In the Saarstahl use case the improved practices for business and IT modelling helped to improve communication and synchronisation between business requirements and IT solutions. However, Saarstahl also noted that most companies already have an existing IT landscape and running systems modelling their processes. A reverse engineering or bottom-up approach should be investigated to cover this missing part.

6. CONCLUSION AND FUTURE WORK
In this paper we have presented an overview of the SoaML modelling language and its application for describing both a business and an IT perspective on SOA. Furthermore, we have defined a set of model transformation rules that can be used to map BPMN models to SoaML models. The application of these mapping rules has been tested in industrial use cases in the SHAPE project with the objective of aligning business and IT models. The SHAPE technologies improved practices for business and IT modelling and improved communication between business requirements and IT solutions.

One aspect of our guidelines that requires further work is to identify and describe additional patterns and guidelines for mapping to service contracts. In particular, better support for multi-tier service contracts requires additional work. Furthermore, the mapping rules defined must also be updated and aligned with the ongoing BPMN 2.0 specification, which introduces some new process and service language constructs.

7. ACKNOWLEDGMENTS
This research was co-funded by the European Union in the frame of the SHAPE FP7 project (ICT-2007-216408). The authors would like to express their acknowledgements to the SHAPE colleagues.

8. REFERENCES
[1] M. Stollberg (ed.), "SHAPE Project Whitepaper", SHAPE STREP, 9 June 2009. http://www.shape-project.eu/wp-content/uploads/2008/01/shape_whitepaper.pdf
[2] OMG, "Service oriented architecture Modeling Language (SoaML), FTF Beta 2", Object Management Group, OMG Document ptc/2009-12-09, December 2009. http://www.omg.org/spec/SoaML/1.0/Beta2/PDF/
[3] OMG, "Service oriented architecture Modeling Language (SoaML), FTF Beta 1", Object Management Group, OMG Document ptc/2009-04-01, April 2009. http://www.omg.org/spec/SoaML/1.0/Beta1/PDF/
[4] OMG, "Business Motivation Model, Version 1.0", Object Management Group (OMG), OMG Document formal/2008-08-02, August 2008. http://www.omg.org/spec/BMM/1.0/PDF/
[5] OMG, "Business Process Model and Notation (BPMN), Version 1.2", Object Management Group (OMG), OMG Document formal/2009-01-03, January 2009. http://www.omg.org/spec/BPMN/1.2/PDF
[6] A.-W. Scheer and M. Nüttgens, "ARIS Architecture and Reference Models for Business Process Management", 2000. http://www.wiso.uni-hamburg.de/fileadmin/wiso_fs_wi/EPK-Community/LNCS_Geschaeftsprozessarchitektur.pdf
[7] ITU-TS, "Basic Reference Model of Open Distributed Processing - Part 1: Overview and guide to use the Reference Model", Rec. X.901 (ISO/IEC 10746-1), 1995.
[8] ITU-TS, "Basic Reference Model of Open Distributed Processing - Part 2: Descriptive model", Rec. X.902 (ISO/IEC 10746-2), 1995.
[9] ITU-TS, "Basic Reference Model of Open Distributed Processing - Part 3: Prescriptive model", Rec. X.903 (ISO/IEC 10746-3), 1995.
[10] ITU-TS, "Basic Reference Model of Open Distributed Processing - Part 4: Architectural Semantics", Rec. X.904 (ISO/IEC 10746-4), 1995.
[11] IEEE, "IEEE Std 1471-2000: IEEE Recommended Practice for Architectural Description of Software-Intensive Systems", IEEE, October 2000.
[12] INRIA & LINA, "ATLAS Transformation Language (ATL) Project Documentation". http://www.eclipse.org/gmt/am3/doc/ (last visited 2010).
[13] BPMI, "Current Implementations of BPMN", Business Process Management Initiative (BPMI). http://bpmn.org/BPMN_Supporters.htm (last visited 2010).
[14] OMG, "Business Process Model and Notation (BPMN), FTF Beta 2 for Version 2.0", Object Management Group (OMG), OMG Document dtc/2010-05-03, May 2010. http://www.omg.org/spec/BPMN/2.0/Beta2/PDF
Domain-specific Templates for Refinement Transformations

Lucia Kapova⋆, Thomas Goldschmidt†, Jens Happe‡, Ralf H. Reussner⋆,†

⋆ Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
Email: {kapova, reussner}@ipd.uka.de
† FZI Research Center for Information Technology, 76131 Karlsruhe, Germany
Email: goldschmidt@fzi.de
‡ SAP Research CEC, 76131 Karlsruhe, Germany
Email: jens.happe@sap.com
ABSTRACT
Model transformations are a major instrument of model-driven software development. Especially in declarative transformation approaches, the structuring of transformations depends to a large extent on the structure of the source models and the generated artefacts. In many cases, similar code is written for transformations that deal with the same source or target metamodel. Writing such transformations can be simplified significantly if re-occurring parts within the transformation rules can be specified in a reusable way. Current approaches to transformation development include means for transformation reuse as well as inheritance. However, modularisation along the boundaries of different parts of domain metamodels is still lacking. Furthermore, the possibilities to reuse transformation fragments that re-occur in multiple transformations are limited. In this paper, we introduce domain-specific templates for refinement transformations with well-defined variation points. Transformation templates are based on known design patterns and enable a modular specification of refinement transformations, and thus yield a simpler definition of transformations that can be grasped more easily and developed more efficiently. In addition, we present a real-world case study of transformation templates in the context of component-based software architectures. The case study gives insight into the application of the presented approach.

Keywords
Software Architecture, Refinement Transformations, Templates, Higher-Order Transformations

Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability; I.6.5 [Simulation and Modeling]: Model Development - Modeling methodologies

General Terms
Design, Performance
1. INTRODUCTION
The OMG’s Model Driven Development (MDD) enables
developers to design and implement software systems at a
high level of abstraction. Routine work is delegated to tools
as far as possible in order to increase efficiency in software
development. Model transformations are the major instrument of model-driven software development for this purpose.
They are heavily used in the development and refinement of
models. The target model of a transformation is a refinement of the source model, if the transformation preserves
large parts of the source model and adds additional information. Such transformations are called refinement transformations [1].
Transformations are mainly determined by the source- and
target-domains on which they operate. Especially in declarative transformation approaches, the structuring of transformations depends to a large extent on the structure of the
source models and the generated artefacts. In many cases,
similar code is written for transformations that deal with
the same source or target metamodel. The re-occurrence of
domain-specific patterns for the creation of the refined model
leads to large parts of duplicated transformation code. This
holds especially for refinement transformations that mostly
operate on a single metamodel. Refinement transformations
often require an annotation phase [1, 2] in which software engineers attach information to individual elements of a model.
The annotations specify which elements are to be refined
by the subsequent transformation. Such annotations and
the underlying model are then transformed into a refined
model [1]. Writing such refinement transformations can be
simplified significantly if re-occurring parts within the transformation rules can be specified in a reusable way.
However, there is little experience available about how
to design and implement refinement transformations using
modern relational transformation languages. One reason for
this is the fact that model transformations are written in
model transformation languages of very recent date (e.g.
QVT Version 1.0 was published in 2008) [3]. Therefore,
a basis of formalised knowledge and experience with model
transformation development is not yet available at a broad
basis. First initiatives for transformation design template
specification focused on generic patterns [4] for model transformations. Although these patterns define a foundation,
they do not exploit domain-specific knowledge of the transformation’s source and target models. For example, they
do not make use of design patterns that are often part of
software models.
This work is a reflection of our experience with the implementation of model transformations used for customized software development. We build on previous work introduced in [5], where we presented an approach to develop architectural refinement transformations based on model annotations. These annotations express configurations that depend mostly on architectural design decisions. We found that the model refinements resulting from these configurations follow certain patterns. Since each software system and its model address a particular domain or a combination of domains, these refinements address the domain defined by a known metamodel. However, the refinements can come from different domains and be expressed in conformance with different metamodels. To support interoperability between different domains, and requirements on refinements originating from different domains, we have to identify common refinement actions that can be expressed as templates and mapped to the domain of origin. This way we increase the level of abstraction in our automated generation process and can build refinement transformations over a number of different domains. Towards this goal, we have to build a library of templates and propose a method supporting the reuse of such transformation parts. In this paper, we address this issue and present an approach that takes advantage of this possibility to reuse transformation parts and automates the development of refinement transformations even further than proposed in [5]. We present parametrised templates for the definition of domain-specific transformations. The templates increase the reusability of already formalised domain knowledge. Reusability and modularisation are the main advantages of transformation templates. Once defined, transformation templates can be reused to create new, configurable transformations for their respective source and/or target models. The contributions of this paper are (I) an extension of the transformation generation process (see [5] for details) to support parametrised transformation templates allowing easy variation of transformations, (II) higher-order transformations for template instantiations and (III) domain-specific templates for architectural refinements.
The remainder of this paper is structured as follows. An
example-driven overview on the process for defining and
integrating reconfigurable transformation templates is presented in Section 2. Section 3 introduces the template metamodel as well as the transformation generation process. Related work is discussed in Section 4. Section 5 concludes the
paper and presents future work.
2. REFINEMENT TRANSFORMATION DEVELOPMENT PROCESS
In this section, we present an overview of the development process for refinement transformations using domain-specific templates. In general, the presented process is similar to processes in product line engineering. Both share the common goals of reusability and customisability. Our process is focused on the reuse of process artefacts. The goal of the process is to automatically generate a refinement transformation based on certain design decisions, expressed as configurations. These configurations could express architectural design decisions, such as the usage of Message Oriented Middleware (MoM) for communication, the usage of thread pools, etc. In the following, we first introduce the basic concepts needed.

Figure 1: Refinement Transformation Development

Figure 1 provides a high-level overview of our approach. The process yields a Refinement Transformation for a particular configuration. The resulting Refinement Transformation is a composition of a Frame and Custom Rules. We use the so-called copy transformations introduced in [6] as Frame. Copy transformations have been motivated by the lack of support for a higher-order operator to specify the copying of whole sub-models in QVT-Relations [3]. In [6], we used a Higher-Order Transformation (HOT) to synthesise default copy transformations in QVT-Relations for a given metamodel. Consequently, the Frame is a transformation that copies the unchanged parts of a model. In this paper, we go beyond the mere creation of a copy transformation and realise a transformation variation based on a configuration. In our process, the selection of Custom Rules for a composition depends on a configuration that allows software engineers to customise a refinement transformation. We use feature models [7] to capture the variabilities of refinement transformations and to specify valid combinations of features. An instance of a feature model represents a specific (feature) configuration, i.e., the actual configuration of a specific transformation. It is defined by selecting or deselecting certain elements of the feature model. The feature configuration determines the actually needed Custom Rules. In our approach, Custom Rules are implemented in QVT-Relations. The insertion of Custom Rules according to a given feature configuration is again achieved by HOTs [5], composing Frame and Custom Rules. HOTs are transformations that themselves operate on transformations. They are mainly used to generate (or transform) transformation specifications. Providing HOTs for QVT-Relations can be done elegantly based on its abstract syntax model. QVT-Relations, with its precondition and postcondition dependency network between mapping relations, can be understood quite well [8] when it comes to transformations that create the abstract syntax model. In this case, relations are used to generate the model of other relations. This way, the complex refinement relations are generated.

In the following, we first give a running example for feature models and feature configurations and then further elaborate the concepts of transformation templates and their instantiation as Custom Rules.
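As an illustration of the Frame concept, the following is a minimal sketch of the kind of default copy relation that the HOT of [6] synthesises, assuming a metamodel CBSE with a class Component carrying a name attribute; it shows only the shape of such generated rules, not the generated code itself:

-- Minimal sketch of a synthesised copy relation (assumed metamodel CBSE
-- with class Component and attribute name); the real Frame contains one
-- such relation per metamodel class, generated by a HOT.
transformation CopyCBSE (source : CBSE, target : CBSE) {
  top relation Copy_Component {
    n : String;
    checkonly domain source c : Component { name = n };
    enforce domain target cCopy : Component { name = n };
    where { Mark_CBSE_Component(c, cCopy); }
  }
  -- Trace relation: allows other rules to retrieve the copy of a given
  -- original element (cf. the Mark_* relations used in Listings 2-4).
  relation Mark_CBSE_Component {
    checkonly domain source c : Component {};
    checkonly domain target cCopy : Component {};
  }
}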
Figure 2: Running Example: Feature Model. (The MoM feature model: Messaging with a Message Channel configured as either a Point-to-Point Channel or a Publish-Subscribe Channel; Sender features Transactional Client, Guaranteed Delivery and the Transaction Size attribute; Receiver features Competing Consumers, Selective Consumer, Durable Subscriber and the Pool Size attribute. Legend: exclusive OR, mandatory feature, optional feature.)

Figure 4: MOM: Transformation Illustration. ((a) Annotated model element; (b) result of the transformation: the generated transformation introduces a Message Sender Adapter and a Message Receiver Adapter around the Message Oriented Middleware and a Consumer Pool Manager, connected through the IMessageSender, IMessageReceiverAdapter, IConsumerPoolRequirer and IPoolManager interfaces.)
Running Example:.
To illustrate the application of the presented approach, we use a real-world case study of message-based systems introduced in [9] as a running example. In Figure 2, a feature model describes the possible configurations of the Message-oriented Middleware (MoM).

The MoM feature model captures possible configurations for a Messaging system. The configuration includes the type of Messaging Channel as well as characteristics of the Sender and Receiver. For example, a Messaging Channel can be configured as a Point-to-Point Channel if only a single Receiver is needed. The Transaction Size is a property of the Sender and expresses the amount of data (N × Message Size) transferred in a transaction. Furthermore, the number of Competing Consumers on the Receiver's side can be specified. The choice of either of these features results in a change of the architectural model. The complexity of these changes varies from setting a parameter, through structural changes, to globally changing the deployment of a whole system.

In our case study, we consider a feature configuration with the selected features: Competing Consumers, Pool Size of 4, Transactional Client, Transaction Size of 100 messages, and Message Size of 1 kilobyte.
Custom Rules:.
In our approach, we define "transformation fragments" as additional information attached to feature models. These fragments are concrete implementations of Custom Rules and are composed into a transformation by a HOT, depending on the selection of features. The transformation fragments are implemented in QVT Relational [3]. Basically, these Custom Rules are fragments of model-to-model transformations that are attached to the individual features of the feature model. A fragment of a transformation is activated if its feature is selected in the configuration. Based on the selected combination of features, a refinement transformation is generated. The explicit binding of fragments to features reduces the complexity of transformations and, thus, alleviates their development and evolution. Our previous work explains this concept in detail [5].

Running Example:.
In our running example, the transformation fragments sketched in Figure 3 express the effects of a feature selection on an architectural model to which the refinements are to be applied. In this example, the simplified fragments add a middleware subsystem to the model as well as transaction handling for transactions of size 100.

However, we identified that many of these transformation fragments follow certain patterns. We extend the refinement transformation generation process by steps that automate the development of transformation fragments. The main contribution of this work is illustrated by the bold framing in Figure 1. We synthesise a template library of domain-specific refinement patterns. This is done on the basis of the supported metamodel, which defines the types of elements that may refine the model. Additionally, based on the domain knowledge, we can identify more complex refinement patterns. Taking advantage of QVT's graphical syntax [3], we can easily represent parametrised templates (with variation points) graphically. The instances of these templates specify concrete Custom Rules.

A deeper view into the process of transformation generation (cf. Figure 6) shows the dependencies and connections between the concepts introduced above. The process depends on the specification of several inputs for Higher-Order Transformations. The first input is a Feature Model with attached Transformation Fragments (Custom Rules). These fragments are used by a Higher-Order Transformation for the actual refinement transformation generation. The second input is the actual Feature Configuration, which defines which features are selected as well as the values of feature attributes. In contrast to an in-place transformation, a refinement transformation may also be specified to create a new model to which the refinements are applied. In this case, the refinement transformation extends a copy transformation (Frame). The Higher-Order Transformation includes the Transformation Fragments into the generated copy transformation. The result of the Higher-Order Transformation is a Refinement Transformation that, when applied to an Architectural Model, generates the corresponding Refined Architectural Model. The Meta-Level Boundary line separates the generation of the transformation from its application.

Running Example:.
Figure 4 illustrates the structural changes of a model resulting from a specific feature model configuration. The goal of our approach is to generate a transformation that yields this model automatically. Therefore, each feature is annotated with transformation fragments, as illustrated by Figure 3.

So far, we have introduced the general concept that can be used to generate refinement transformations for architectural models with respect to specific feature configurations. The goal of this work is to ease the development of transformation fragments (Custom Rules). This is achieved through the instantiation of Transformation Templates from the Template Library. In the following section, we extend the concept by configurable transformation templates that further facilitate the specification of transformation fragments. This will fill in the unexplained pieces of Figure 6.
Figure 3: Running Example: Transformation Fragments. (The MoM feature model of Figure 2 is shown with QVT fragments attached to its features; the annotation TC_size.varSize = size binds the Transaction Size feature attribute to the varSize variable of the T_Client relation.) The two recoverable fragments are:

top relation T_Client {
  varSize : Integer;
  checkonly domain in p : Component {};
  enforce domain out s : TP {
    size = varSize;
  };
  when {
    TC(p, s);
  }
  where {
    varSize = 100; --default
  }
}

top relation Messaging {
  checkonly domain in p : Component {};
  enforce domain out s : MoM {};
}
3. CONFIGURATION-AWARE TRANSFORMATION TEMPLATES
The automated generation of refinement transformations
presented in the previous section significantly reduces the
effort needed to specify such transformations. However, the
Custom Rules still tend to contain a large set of similar
elements, especially for architectural models. Therefore, we
propose transformation templates as an additional means to
ease the specification of refinement transformations.
Transformation templates are stored in a Template Library (cf. Figure 6). New Custom Rules can be specified by instantiating and composing the existing Templates. Furthermore, templates are configurable by a set of parameter values. Based on the template and its configuration, the Template Instantiation Transformation (HOT) creates Template Instances and adds the necessary rules to the refinement transformation.
Each template represents a configurable specification of a domain-specific pattern re-occurring in a transformation. Figure 5 illustrates the set of patterns we have identified so far for the running example. A Coupled Adaptor allows sender and receiver to use the same message-oriented middleware. This pattern can be used in the case of refinement by coupled actions, such as encryption and decryption, or composition and decomposition. The Lock Requirer is used when a component has to acquire a lock before accessing a certain service and release the lock when finished. The same refinement pattern can be observed in the case of dependent actions. In the example, this pattern is used for the Message Receiver Adaptor component to acquire locks through the IConsumerPoolRequirer interface. An Active Component pattern is used to model a component with a complex internal behavior. This pattern refines the model with an element introducing an independent behavior branch. An additional wrapper is provided for the functionality defined as an internal action of the component behavior. To provide a queue for competing consumers, the Lock Manager pattern is used in the ConsumerPoolManager component. This pattern can be used when introducing a state-holding element to the model. The Controller pattern is applied to the Clock component to provide a wrapper for simple monitor functionality. The last pattern, the Delegator, introduces a new functionality into the refined model that can be independently required by already existing model elements.

Figure 5: Templates introduction. ((a) Pattern illustration based on the case study: the Coupled Adaptor, Lock Requirer, Active Component and Lock Manager patterns applied to the Message Sender Adapter, Message Receiver Adapter, Message Oriented Middleware and Consumer Pool Manager with the interfaces IMessageSender, IMessageReceiverAdapter, IConsumerPoolRequirer and IPoolManager; (b) additional patterns: Controller and Delegator, illustrated with a Clock component.)
In the following section, we describe the adaptor pattern,
as a representative, in more detail. For the description
of transformation patterns, we use a standard description
schema for patterns defined in [10] and [4]. This includes
the following information: the name of the pattern, the goal
of the change, the motivation for the pattern, the specification of the template using the QVT-Relations language, and
an example for the pattern.
3.1 The Adaptor template
In this section, we illustrate the concepts introduced above
with the example of the adaptor pattern [10]. For the application within a refinement transformation further details
concerning the specific metamodel are necessary.
Figure 6: Transformation Process. (The Higher-Order Transformation Chain takes as input a Feature Configuration (an instance of the Feature Model), the Transformation Fragments (Custom Rules) attached to the Feature Model, and the Template Library with template configurations. The Copy Higher-Order Transformation produces the Copy Transformation (Frame), the Template Instantiation Transformation produces the Template Instances, and the Refinements Integration Higher-Order Transformation composes Frame and Custom Rules into the generated Refinement Transformation. Applied to an Architectural Model, this transformation yields the Refined Architectural Model; the Meta-Level Boundary separates the generation of the transformation from its application.)

3.1.1 Goal:
Change the provided or required service interface.

3.1.2 Motivation:
When new functionality is needed in an architecture (for example filtering), its configuration could result in a change of a service's signature (input or return parameters). The change of the adapted interface is considered a configurable change and allows developers to define the changed attributes without the need to reimplement the whole transformation for the integration.

3.1.3 Specification:
The adaptor pattern is specified by a template that creates an Adaptor component which requires the interface provided by the adapted component and provides the interface required by the calling component. Additionally, based on a designer-defined method mapping, it requires or provides a modified interface to another component in the system. As illustrated in Listing 1, an adaptorComponent is created with the modified interface targetInterface in the target domain.

transformation CBSE_Adaptor (source : CBSE, target : CBSE) {
  top relation Adaptor_template_CreateAdaptor {
    checkonly domain source sourceInterface : {
      --adapted interface
      <fromInterface:TemplateVariationPoint>
    };
    checkonly domain source targetInterface : {
      <toInterface:TemplateVariationPoint>
    };
    enforce domain target adaptorComponent : {
      name = <adaptorName:LiteralExpVariationPoint> --name
      requiredRoles = reqRole : RequiredRole {
        requiredInterface = sourceInterface }
      providedRoles = provRole : ProvidedRole {
        --modified interface
        providedInterface = targetInterface }
      serviceEffectSpecifications =
        --behavior specification
        seff : ServiceEffectSpecification { ... }
    }
  };
}

Listing 1: Template Specification of the Adaptor pattern.

3.1.4 Example:
An example of an Adaptor is shown in Figure 5. This Adaptor provides an interface to the message receiver and adapts its required interface to communicate with the used messaging middleware.

3.1.5 Applicability:
The applicability of a pattern defines constraints for the usage of a template. For the Adaptor template such a constraint is defined by the requirement that the variation point should be of type interface.

Additional examples illustrating the variation point approach for model transformation templates are given in Table 1. The variation point types map to known element types for the specification of component-based architectures (e.g. components, interfaces, signatures, resources, etc.).
Table 1: Transformation Templates

Template: Delegator
Goal: Provides a wrapper for a required or provided interface and delegates additional information without adjusting the signature.
Variation Point Type: Interface

Template: Coupled Adaptor/Delegator
Goal: Adapts two interfaces allowing their communication, or, in the case of delegation, allows them to use a communication connection together without changing their signatures.
Variation Point Type: Interface

Template: Lock Requirer
Goal: Provides an interface requiring a software resource (thread pool, queue or semaphore).
Variation Point Type: Interface

Template: Lock Manager
Goal: Models a component providing a passive software resource (thread pool, queue, semaphore).
Variation Point Type: Passive Resource

Template: Active Component
Goal: Provides a component with its own, independent control flow thread.
Variation Point Type: Component

Template: Controller
Goal: Adds a mutex to all method calls allowing only a single thread to access the component at one time.
Variation Point Type: Internal action
3.2 Template Definition Metamodel
To define a framework supporting the definition and configuration of transformation templates, we need to describe them and their variations in a general way. This description is provided by means of a metamodel introduced in this section and illustrated by Figure 7. As the main element of the transformation templates metamodel we introduce the Template element. This element represents the concept of a transformation template in our terminology and defines a reconfigurable and reusable transformation fragment for the model transformation generation. The Description of a template contains a definition of the Goal of the template as well as a textual Motivation for the Template definition. Each Template defines its applicability, or usage scenarios, by specifying an OCL Constraint. To be able to apply a template in a certain context, this constraint needs to evaluate to true. The Template element refers to a set of Relations from the QVT Relational metamodel. These relations form the basis of the template, as they will be parametrised by VariationPoints as defined below. Furthermore, the Template definition contains a set of VariationPoints. These variation points define possibilities for variations within the basic relations. A VariationPoint is defined by a reference to either a template expression (TemplateExp) or a relation domain (RelationDomain). These points are defined by the subclasses of VariationPoint named TemplateVariationPoint, DomainVariationPoint, and LiteralVariationPoint (for the specification of variable literals within a template).

The association dependencies of the Template class expresses dependencies between transformation templates. Defined transformation templates depend on each other, and therefore these constructs need access to the results of the required transformation templates.

The binding of a template to an actual transformation fragment is done as soon as the template is referenced within an actual transformation fragment that is defined for a concrete feature model. The actual application of the transformation template is defined by the TemplateConfig. For each defined VariationPoint the template configuration includes VariationPointInstances, which bind the VariationPoint to actual templates or relation domain specifications. VariationPointInstances can be assigned to multiple VariationPoints stemming from different transformation templates. This yields the possibility to combine transformation templates to build more complex model transformations.
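To make the binding of variation points concrete, the following hypothetical instantiation of the Adaptor template from Listing 1 sketches the Custom Rule that could result once fromInterface, toInterface and adaptorName are bound to elements of the running example; the interface and component names are assumptions for illustration, not output of the actual HOT:

-- Hypothetical instantiation result: the variation points of Listing 1
-- replaced by concrete elements of the running example (names assumed).
transformation CBSE_AdaptorInstance (source : CBSE, target : CBSE) {
  top relation CreateMessageReceiverAdaptor {
    checkonly domain source sourceInterface : Interface {
      name = 'IMessageReceiver'        -- bound fromInterface
    };
    checkonly domain source targetInterface : Interface {
      name = 'IMessageReceiverAdapter' -- bound toInterface
    };
    enforce domain target adaptorComponent : Component {
      name = 'MessageReceiverAdapter', -- bound adaptorName literal
      requiredRoles = reqRole : RequiredRole {
        requiredInterface = sourceInterface },
      providedRoles = provRole : ProvidedRole {
        providedInterface = targetInterface }
    };
  }
}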
Figure 7: The Templates Metamodel

3.3 Template Instantiation
The instantiation process presented in Listing 2 is realized using a HOT. It merges the transformation using the templates and creates a transformation based on the actual configuration given by the template configuration model. The first step of the TemplateInstantiation is the creation of a copy of the relations that were specified within the template. For this we use a generated copy transformation for the QVT-Relations metamodel(1). The TemplateInstantiation transformation extends this copy transformation. Repository2Transformation creates a new transformation that will then contain the configured templates. Furthermore, AddTypedModels adds the model parameters of the transformation to the transformation as they were specified in the template repository. Each used and configured template is then added to the newly generated transformation by the IntegrateRelations relation. All other template relations that were copied from the template repository by the copy transformation will be ignored.

Further parts of the HOT are responsible for binding the variation points of the templates to the elements from the actual template configuration. Listing 3 shows the necessary relations for binding a TemplateVariationPoint.

An extension to the generated QVT-R copy transformation is made by overriding the generated copy relations for those elements that may be variation points in the templates. In the example above this would be all copy relations that inherit from TemplateExp. Listing 4 shows how this is done for the ObjectTemplateExp. This extension will cause the copy transformation to omit, during the copy process, all TemplateExp that are variation points. For each binding that is configured in the template configuration, the BindTemplateVariationPoint relation in Listing 3 will call the Mark_QVTTemplate_TemplateExp relation.

(1) The Mark_QVTRelation_Relation relation that is used here is part of this generated transformation. Using it, it is possible to retrieve the copied instance of a given original relation. For each class in the corresponding metamodel such a relation exists.
transformation templateInstantiation(source : templateDefinition,
    config : templateDefinition, target : QVTRelation)
    extends CopyQVTRelation {

  top relation Library2Transformation {
    n : String;
    checkonly domain source templateLib : templateLibrary {
      domain = n
    };
    enforce domain target t : QVTRelation::RelationalTransformation {
      name = n + '_templateInstantiation'
    };
    where { MarkTargetTransformation(t); }
  }

  relation MarkTargetTransformation {
    checkonly domain target t : QVTRelation::RelationalTransformation {};
  }

  top relation AddTypedModels {
    checkonly domain source templateRep : templateRepository {
      modelParameter = mm : QVTBase::TypedModel {}
    };
    enforce domain target t : QVTRelation::RelationalTransformation {
      modelParameter = mmCopy : QVTBase::TypedModel {}
    };
    when { Repository2Transformation(templateRep, t);
           Mark_QVTBase_TypedModel(mm, mmCopy); }
  }

  top relation IntegrateRelations {
    n : EString;
    checkonly domain source templateConfig :
        templateDefinition::templateConfig {
      instanceOf = template : templateDefinition::template {
        name = n,
        templateRelations = templateRel : QVTRelation::Relation {} }
    };
    enforce domain target targetRelation : QVTRelation::Relation {
      name = n + '_template_' + templateRel.name,
      transformation = t : QVTBase::Transformation {}
    };
    when {
      MarkTargetTransformation(t);
      Mark_QVTRelation_Relation(templateRel, targetRelation);
    }
  }
  [...]
}

Listing 2: Higher-order transformation for instantiating templates.

top relation BindTemplateVariationPoint {
  n : EString;
  variationPointBindings : OrderedSet(VariationPoint);
  checkonly domain source variationPoint :
      templateDefinition::TemplateVariationPoint {
    name = n,
    relationTemplate = relationTemplate : QVTRelation::Relation {},
    template = variationTemplate : QVTTemplate::TemplateExp {}
  };
  checkonly domain config variationPointInstance :
      templateDefinition::TemplateVariationPointInstance {
    bindsTo = variationPointBindings,
    template = instanceTemplate : QVTTemplate::TemplateExp {}
  };
  enforce domain target targetTemplate : QVTTemplate::TemplateExp {};
  when {
    Mark_QVTTemplate_TemplateExp(instanceTemplate, targetTemplate);
    variationPointBindings->includes(variationPoint);
  }
  where {
    Mark_QVTTemplate_TemplateExp(variationTemplate, targetTemplate);
  }
}
[...]

Listing 3: Binding of template variation points.

--Override the Generated Copy Rule:
top relation Copy_QVTTemplate_ObjectTemplateExp
    overrides Copy_QVTTemplate_ObjectTemplateExp {
  checkonly domain source sourceObjectTemplateExp :
      QVTTemplate::ObjectTemplateExp {};
  checkonly domain source variationPoint :
      templateDefinition::TemplateVariationPoint {
    template = variationTemplate : QVTTemplate::TemplateExp {}
  };
  enforce domain target targetObjectTemplateExp :
      QVTTemplate::ObjectTemplateExp {};
  when {
    not(sourceObjectTemplateExp = variationTemplate);
  }
  where {
    Mark_QVTTemplate_ObjectTemplateExp(
      sourceObjectTemplateExp, targetObjectTemplateExp);
  }
}
[...]

Listing 4: Overriding Copy Rules.

Due to the functionality of the copy transformation, this will cause the copy relations to treat the substituted template as the copy of the original and to assign it to all points in the template's copy where the original template was used. See [6] for a detailed description of the copy transformation and how it works.

3.4 Examples of Transformation Templates

3.4.1 The Delegator Template

Goal.
Provide a wrapper for a required or provided interface and delegate its functionality based on the unchanged signature.

Motivation.
A delegator can be used, for example, when for each request a semaphore lock should be requested from the semaphore provider service before the request is allowed to reach the interface.
Specification.
This template is specified by a relation that creates a Delegator component that requires or provides a delegated interface to other components in the system. Additionally, a Delegator could request services from other components. This template could be used to generate the initial structures for this.

transformation CBSE_Delegator (source : CBSE, target : CBSE) {
  top relation Delegator_template_CreateDelegator {
    checkonly domain source delegatedInterface : {
    };
    enforce domain target delegatorComponent : {
      name = <delegatorName:LiteralExpVariationPoint>
      requiredRoles = reqRole : RequiredRole {
        requiredInterface = delegatedInterface }
      providedRoles = provRole : ProvidedRole {
        providedInterface = delegatedInterface }
      serviceEffectSpecifications =
        seff : ServiceEffectSpecification { ... }
    }
  };
}

Listing 5: Template Specification of the Delegator template.

Example.
An example of a Delegator is shown in Figure 5 as an additional template. This Delegator provides interfaces to the message receiver with the same interface.

Applicability.
For the Delegator template it is required that the variation point is not of type interface.

3.4.2 The Coupled Adaptor/Delegator template

Goal.
To adapt two interfaces and to allow their communication; or, in the case of delegation, to allow them to use a communication connection together without changing their provided functionality.

Motivation.
When it is needed to build a connector between two communicating components, or to build a chain of delegators to access certain external functionality in a certain state of message delivery.

Specification.
This template is specified by a relation that creates two Delegator or Adaptor components that mirror their adapted or delegated interface.

Example.
An example of a Coupled Adaptor is shown in Figure 5. This construct allows sender and receiver to use the same message-oriented middleware.

Applicability.
For the Coupled Adaptor/Delegator template it is required that the variation point is (for the adaptor) or is not (for the delegator) of type interface.

3.4.3 The Lock Requirer template

Goal.
To provide an interface requiring a software resource (thread pool, queue or semaphore).

Motivation.
When a component has to acquire a lock before accessing a certain service and release the lock when finished.

Specification.
This template is specified by a relation that extends an already existing component in the model with an interface requiring an external service that provides acquire() and release() on a lock resource held by the called component. This specification implies the existence of a Lock Manager in the system.

transformation CBSE_LockRequirer (source : CBSE, target : CBSE) {
  top relation LockRequirer_template_CreateLockRequirer {
    checkonly domain source synchronizedInterface : {
    };
    enforce domain target lockRequirerComponent : {
      name = <lockRequirerName:LiteralExpVariationPoint>
      requiredRoles = reqRole : RequiredRole {
        requiredInterface = synchronizedInterface,
        requiredInterface = <lockName:TemplateVariationPoint> }
      providedRoles = provRole : ProvidedRole {
        providedInterface = synchronizedInterface }
      serviceEffectSpecifications =
        seff : ServiceEffectSpecification { ... }
    }
  };
}

Listing 6: Template Specification of the Lock Requirer template.

Example.
An example of a Lock Requirer is shown in Figure 5, illustrated by the extension of the Message Receiver Adaptor component with an IConsumerPoolRequirer interface.

Applicability.
For the Lock Requirer template it is required that the variation point is of type LockManagerReference.

3.4.4 The Active Component template

Goal.
To provide a wrapper for functionality defined as an internal action of a component behavior.

Motivation.
When it is needed to model a component with a complex internal behavior.

Specification.
This template is specified by a relation that creates an Active component that requires or provides a delegated interface to other components, depending on the developer's specification. Since this template is only a frame for an implementation, it is the most complex template, with no restrictions on its VariationPoints.
Example.
An example of an Active component is shown in Figure 5, illustrated by the Message-oriented Middleware.

Applicability.
There are no restrictions for this template. Consequently, this template requires more user interaction to implement.

3.4.5 The Lock Manager template

Goal.
To model a component providing a passive software resource (thread pool, queue, semaphore).

Motivation.
When a synchronization mechanism based on a lock strategy is used in a system.

Specification.
This template is specified by a relation that creates a LockManager component that provides an interface with the two signatures acquire() and release() on its internal passive resource.

transformation CBSE_LockManager (source : CBSE, target : CBSE) {
  top relation LockManager_template_CreateLockManager {
    checkonly domain source appRepository : {
    };
    enforce domain target lockManagerComponent : {
      name = <lockManagerName:LiteralExpVariationPoint>
      requiredRoles = reqRole : RequiredRole {}
      providedRoles = provRole : ProvidedRole {
        providedInterface = lockInterface }
      serviceEffectSpecifications =
        acquireLock_seff : ServiceEffectSpecification { ... }
      serviceEffectSpecifications =
        releaseLock_seff : ServiceEffectSpecification { ... }
      passiveResource = lock {
        <lock:TemplateVariationPoint> }
    };
  }
}

Listing 7: Template Specification of the Lock Manager template.

Example.
An example of a LockManager is shown in Figure 5. This ConsumerPoolManager provides a queue for competing consumers.

Applicability.
For the LockManager template it is required that the variation point is of type passiveResource.

3.4.6 The Controller template

Goal.
To provide a wrapper for simple monitor functionality.

Motivation.
When it is needed to model a component that only gains and stores data, or provides some timing control; for example, a clock component required by a connector or by accessing middleware, providing a control interface externally to set the clock and an interface internally for other components in the assembly to query the clock.

Specification.
This template is specified by a relation that creates a Controller component that requires or provides a delegated interface to another component. This component has only a simple internal action defined and creates processing delay through computation.

Example.
An example of a Controller is shown in Figure 5 as an additional template. This Controller provides a clock for a connector.

Applicability.
For the Controller template it is required that the variation point is of type internal action.

4. RELATED WORK
In the domain of model transformation languages, transformation composition, transformation generation, and template definition for model transformations are relatively new.

Transformation Composition and Configuration:.
Dealing with the composition of transformations, we are heading towards complex problems that are in the focus of many currently running research initiatives. One of them is [11], which proposes a superimposition composition technique for ATL and QVT Relations. Other works [12] and [13] investigate possibilities of composing complex transformations from atomic transformation definitions. Our approach differs from these composition methods because it is based on a predefined structure (i.e. the feature model) that guides the transformation developer. Furthermore, our focus is on metamodel-specific transformation generation and not on generic composition techniques. Therefore, many problems that arise when trying to compose arbitrary atomic transformation parts are avoided.

Transformation Generation Approaches:.
The automated framework DUALLY [14] aims to answer the issues concerning tool and language interoperability. This approach introduces the concept of transformation generation with the purpose of translating model specifications from one language to another. The transformation generation is based on a mapping of elements between these languages. In its current state it is not able to generate refinement transformations.

Transformation templates or patterns:.
Iacob et al. [4] introduced an initial set of design patterns for transformation specification. We extend this set with transformation refinement templates that are specific to the handling of design decisions based on specific metamodels. Agrawal et al. [15] introduce a graph transformation language named GREAT and a set of templates for graph transformations. Additionally, work on templates defining transformation problems was published by [16].

5. CONCLUSION
In this paper, we described how the writing of refinement transformations can be simplified based on transformation reconfiguration and generation. The configuration process is based on explicitly defined feature models that also bear
the necessary transformation fragments that contribute to the refined architecture model. These transformation fragments are instances of pre-defined templates. Additionally, we have presented an approach to specify reusable transformation templates that re-occur in transformation development for specific metamodels (domains). Based on these templates, refinement transformations can then be generated using HOTs.

Limitations:.
Despite the advantages in simplifying the configuration of transformations with our feature-model-based approach, there are also some limitations that need to be discussed. One particular drawback of our approach is the debuggability of the transformation. The debugger of the transformation engine will execute and observe only the generated and woven transformation. Hence, a transformation developer will need to understand the generated transformation in order to be able to debug it. A specialised debugger would be needed if debugging should be possible on the configuration level.

Future work:.
The presented work is part of continuous research on automatic transformation composition and generation. Further templates will be identified, and the library of known transformation templates will be extended in the future. Although the presented case study demonstrates the benefits of the approach, we plan a quantitative evaluation and efficiency study (e.g. as a controlled experiment). Future work will deal with the automatic derivation of templates from example models, as proposed in [17]. This would greatly ease the development of templates, as the manual extraction from an instance model to the transformation can be shortened significantly. Additionally, further development of the retainment policies introduced in [18] is needed to support the migration of manual changes on the models.
6. REFERENCES
[1] M. Girschick, T. Kühne, and F. Klar, “Generating
systems from multiple levels of abstraction,” in
Conference on Trends in Enterprise Application
Architecture, 2006.
[2] M. Moriconi, X. Qian, and R. A. Riemenschneider,
“Correct architecture refinement,” IEEE Trans. Softw.
Eng., vol. 21, no. 4, 1995.
[3] MOF 2.0 Query/View/Transformation, version 1.0,
2008.
[4] M.-E. Iacob, M. Steen, and L. Heerink, “Reusable
model transformation patterns,” Enschede: Freeband,
2008.
[5] L. Kapova and T. Goldschmidt, “Automated feature
model-based generation of refinement
transformations,” in EUROMICRO (SEAA). IEEE,
2009.
[6] T. Goldschmidt and G. Wachsmuth, “Refinement
transformation support for QVT Relational
transformations,” in MDSE, 2008.
[7] K. Czarnecki and U. W. Eisenecker, Generative
Programming - Methods, Tools and Applications.
Addison-Wesley, 2000.
[8] L. Kapova, T. Goldschmidt, S. Becker, and J. Henss, “Evaluating Maintainability with Code Metrics for Model-to-Model Transformations,” in QoSA, 2010.
[9] J. Happe, H. Friedrich, S. Becker, and R. H. Reussner, “A Pattern-Based Performance Completion for Message-Oriented Middleware,” in WOSP. ACM, 2008.
[10] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, 1995.
[11] D. Wagelaar, “Composition techniques for rule-based model transformation languages,” in Conference on Model Transformation - Theory and Practice of Model Transformations, 2008.
[12] J. Oldevik, “Transformation composition modelling framework,” in Distributed Applications and Interoperable Systems, 2005.
[13] R. Marvie, “A transformation composition framework for model driven engineering,” LIFL – IRCICA, University of Lille, Tech. Rep., 2004.
[14] I. Malavolta, H. Muccini, and P. Pelliccione, “Dually: A framework for architectural languages and tools interoperability,” in Conference on Automated Software Engineering (ASE), 2008.
[15] A. Agrawal, A. Vizhanyo, Z. Kalmar, F. Shi, A. Narayanan, and G. Karsai, “Reusable idioms and patterns in graph transformation languages,” in GraBaTs. Elsevier, 2004.
[16] E. D. Willink and P. J. Harris, “The side transformation pattern: Making transforms modular and re-usable,” ENTCS, 2005.
[17] D. Varró, “Model transformation by example,” in 9th International Conference on Model Driven Engineering Languages and Systems (MODELS), 2006.
[18] T. Goldschmidt and A. Uhl, “Retainment Rules for Model Transformations,” in Workshop on Model Co-Evolution and Consistency Management, 2008.
Advanced Modelling Made Simple with the Gmodel Metalanguage

Jorn Bettin
Sofismo
Lenzburg, Switzerland
http://www.sofismo.ch/
jbe@sofismo.ch

Tony Clark
School of Engineering and Information Sciences
Middlesex University, London, UK
http://www.eis.mdx.ac.uk/staffpages/tonyclark/
t.n.clark@mdx.ac.uk
ABSTRACT
Gmodel is a metalanguage that has been designed from the ground up to enable the specification and instantiation of modelling languages. Although a number of metalanguages can be used for this purpose, most provide no or only limited support for modular specifications of sets of complementary modelling languages. Gmodel addresses modularity and extensibility as primary concerns, and is based on a small number of language elements that have their origin in model theory and denotational semantics. This article illustrates Gmodel’s capabilities in the area of model-driven integration by showing that the Eclipse Modeling Framework Ecore language can easily be emulated. Gmodel offers support for unlimited multi-level instantiation in the simplest possible way, and any metalanguage emulated in Gmodel can optionally be equipped with this functionality.

Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design Tools and Techniques; D.2.12 [Software Engineering]: Interoperability

General Terms
Binding times, Denotational semantics, Domain analysis, Graphs, Instantiation semantics, Metamodels, Model-driven integration, Model theory, Modularity, Multi-level modelling, Scope management, Value chain modelling

1. INTRODUCTION
In order to increase awareness about the role that domain-specific modelling languages can play in capturing, preserving, and exploiting knowledge in virtually all industries, it is necessary to:

1. Reach a consensus on fundamental values and principles for designing and using domain-specific languages
2. Progress towards interoperability between tools

– KISS Initiative, 2009 [4]

The development of Gmodel relates to the second objective of the KISS initiative [3], and builds on the KISS results that have been achieved in 2009 [4]. In particular, Gmodel represents an attempt to provide explicit tool support for the full set of KISS principles:

1. The DSL must be meaningful to users
2. The DSL should be cognitively efficient
3. The DSL should have multiple notations where necessary
4. DSLs should offer mechanisms for modularising and integrating models
5. The DSL should be supported by appropriate tooling
6. There must be an economic imperative for the development of a DSL
7. The DSL must not be polluted with implementation features
8. Model processing must always be based on a formal DSL definition
9. DSLs should be kept small through modularisation and integration

Since the design of Gmodel rests on mathematical concepts from model theory and from the theory of denotational semantics, Gmodel can tap into established mathematical terminology, and the target audience for Gmodel includes modellers in all disciplines. Consistent with denotational semantics and with the third KISS principle, Gmodel completely separates the concern of representation from the concern of naming. This means that, in contrast to most programming language specifications, the specification of Gmodel does not include a text-based concrete syntax.

The authors of Gmodel believe that modelling has the greatest value when performed by domain experts, and when modelling language design takes into account established domain notations. The challenge consists in providing a metalanguage that enables the most experienced domain experts to define the notation for modelling in their field, whilst at the same time providing tool support for enforcing (and ideally guaranteeing) adherence to the KISS modelling language design principles.
This paper starts with a brief introduction of relevant terminology, and then presents the Gmodel metalanguage in the context of model-driven interoperability, based on practical examples:

1. Introduction of Gmodel kernel concepts
2. Outline of Gmodel’s contribution to model-driven interoperability
3. Representation of the Eclipse Modeling Framework Ecore language in Gmodel, and outline of the design of a bi-directional bridge between the two technologies
4. Description of advanced modelling techniques for scope management, modularisation, and interoperability
5. Comparison of Gmodel with other technologies
6. Conclusions

2. TERMINOLOGY
Modellers are not in the business of inventing new terminology; they are in the business of identifying concepts and links between concepts that are useful for a particular community of people – usually scientists or professionals in a particular field. This approach to modelling is consistent with the Oxford American dictionary definition of modelling:

to model devise a representation, especially a mathematical one, of (a phenomenon or system)

2.1 Model Theory
The mathematical definitions from model theory make use of the concepts of sets and graphs, and provide a mathematical basis for reasoning about models:

structure A structure A is a set that contains the following four sets:
1. A set called the domain of A, written as dom(A)
2. A set of elements of A called constant elements, each of which is named by one or more constants
3. For each positive integer n, a set of n-ary relations on dom(A), each of which is named by one or more n-ary relation symbols
4. For each positive integer n, a set of n-ary operations on dom(A), each of which is named by one or more n-ary function symbols

signature The signature of a structure A is specified by giving the set of constants of A, and for each separate n > 0, the set of n-ary relation symbols and the set of n-ary function symbols. The symbol L is used to represent signatures and languages; if A has a signature L, A is also called an L-structure.
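Restated compactly in conventional notation (our paraphrase of the two definitions above, following [7]), a structure and its signature can be written as:

\[
\mathcal{A} \;=\; \Big( \mathrm{dom}(\mathcal{A}),\;
\{\, c^{\mathcal{A}} \,\}_{c \in C},\;
\{\, R^{\mathcal{A}} \subseteq \mathrm{dom}(\mathcal{A})^{n} \,\}_{R \in \mathcal{R}_{n},\, n \ge 1},\;
\{\, f^{\mathcal{A}} : \mathrm{dom}(\mathcal{A})^{n} \to \mathrm{dom}(\mathcal{A}) \,\}_{f \in \mathcal{F}_{n},\, n \ge 1} \Big)
\]

where C is the set of constants and, for each n, \(\mathcal{R}_n\) and \(\mathcal{F}_n\) are the sets of n-ary relation symbols and n-ary function symbols. The signature is then \(L = (C, (\mathcal{R}_n), (\mathcal{F}_n))\), and a structure A with signature L is an L-structure.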
Beyond these two definitions, model theory defines the following concepts that every modelling language designer should be familiar with: substructure, term, formula, variable, language, cardinality, sentence, theory, model [7].

This terminology and the associated mathematical theory have heavily influenced the design of Gmodel.

2.2 Denotational Semantics
The second source of influence on Gmodel is denotational semantics, in particular the concepts of semantic domain and semantic identity [11].

One advantage of using established mathematical terminology to describe Gmodel is a low risk of terminological confusion with concepts from the Meta Object Facility (MOF), a popular metalanguage that is steeped in object orientation, and with concepts from related implementations such as the Eclipse Modeling Framework Ecore language. This benefit immediately becomes apparent when discussing the representation of Ecore in Gmodel. A second advantage of using the above terminology is the ability to reason about Gmodel in mathematical terms, without the need for any linguistic gymnastics.

2.3 Natural Language and Exchange of Artefacts
In addition to mathematics, Gmodel terminology draws on concepts that have shaped the development of natural language, and the way in which humans perform work and exchange artefacts – including abstract ideas. In relation to the latter, and in accordance with the second KISS principle, the design of Gmodel takes into account human cognitive abilities and limitations [10].

language artefact A container of information that:
1. is created by a specific actor (human or system)
2. is consumed by at least one actor (human or system)
3. represents a natural unit of work (for the creating and consuming actors)
4. may contain links to other language artefacts
5. has a state and a life-cycle

model artefact A language artefact that meets the following criteria:
1. It is created with the help of a software program that enforces specific instantiation semantics (quality-related constraints)
2. The information contained in a model artefact can easily be processed by software programs (in particular transformation languages)
3. Referential integrity between model artefacts is preserved at all times with the help of a software program (otherwise the necessary level of completeness and consistency is adequate neither for automated processing nor for domain experts making business decisions based on artefact content)
4. No circular links between model artefacts are allowed at any time (a prerequisite for true modularity and maintainability of artefacts)
5. The life-cycle of a model artefact is described in a state machine (allowing artefact completeness and quality assurance steps to be incorporated into the artefact definition)

instance A set that seems to contain one and only one element at any given point in time from the view point of a specific actor

instantiation function A function that returns an instance – sometimes instantiation functions are also called concretisation functions

visibility Visibilities are links between model artefacts that set the architectural context for artefact producers by declaring the model artefacts that can be used as inputs for the creation of specific kinds of model artefacts

producer An actor that creates language artefacts

consumer An actor that consumes language artefacts

value chain A value chain consists of actors (systems and humans) that consume artefacts as inputs and produce derived artefacts

3. THE GMODEL KERNEL
The Gmodel kernel [figure 1] is a semantic domain consisting of a set of semantic identities that reify the concepts of ordered pair, ordered set, and graph – the latter consisting of a set of vertices and a set of edges. To facilitate extensibility and multi-level instantiation, the encoding of the Gmodel kernel is entirely expressed in Gmodel semantic identities, and each semantic identity in the kernel is defined as an instance of itself, and as a sub set of the next simpler semantic identity in the kernel.

Figure 1: The Gmodel kernel within a typical usage context

Gmodel does not mandate a layered metamodel architecture. Our modelling experience in software-intensive industries has taught us that the model pattern known as the power type pattern in object orientation occurs pervasively in highly configurable systems. The power type pattern is a technical kludge that forces the fragmentation of semantic identities, and it clearly demonstrates the limits of the object-oriented paradigm – which is currently still treated as dogma by many software engineers. By allowing multi-level instantiation, the need for the power type pattern is eliminated, and the fragmentation of semantic identities can be avoided.

The generic term used to refer to any semantic identity that is expressed in Gmodel is the set. The simplest semantic identity is the ordered pair. Ordered pairs are used to define ordered sets and graphs. Much of the power and simplicity of Gmodel has its origin in the specific encoding chosen for graphs. Instead of consisting of a set of vertices and a set of edges, a Gmodel graph is encoded as a set of vertices and several complementary ordered sets of links:

edges Links between two sets with a dedicated edge end for each connected set
super set references Directed links from a sub set to a super set
visibilities Directed links from one sub graph to another sub graph
edge traces Directed links from one edge to another edge

The Gmodel vertex and all four types of links are encoded as sub sets of graph. In order to serve as a metalanguage, edge ends are decorated with variables for minimum cardinality and maximum cardinality, as well as variables that represent the direction of navigability of edges and a notion of containment relating to the connected set.
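In set-theoretic terms, this encoding can be summarised as follows (our notation; the Gmodel project itself does not prescribe a formula):

\[
G = \big( V,\; E,\; R_{\mathrm{sup}},\; R_{\mathrm{vis}},\; R_{\mathrm{tr}} \big),
\qquad
e = \big( (s_1, d_1), (s_2, d_2) \big) \in E,
\qquad
d_i = (\mathit{min}_i,\, \mathit{max}_i,\, \mathit{nav}_i,\, \mathit{cont}_i)
\]

where V is the set of vertices, each edge e connects two sets s1 and s2 via edge ends decorated with cardinality, navigability and containment variables d_i, R_sup holds the directed super set references, R_vis the visibilities between sub graphs, and R_tr ⊆ E × E the edge traces.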
3.1 Instantiation
A modeller may use the instantiation function of the Gmodel kernel to create representations of vertices and links. Since vertices are encoded as a sub set of graph – and hence enable the representation of nested abstractions – vertices are well positioned to serve as the unit of modularity in Gmodel. Using the terminology introduced above, vertices play the role of model artefacts, and in the context of Gmodel (modelling) they are simply referred to as artefacts.

Links between artefacts are also encoded as a sub set of graph, and are therefore also capable of representing nested abstractions by containing sets of vertices and sets of links. A link between two artefacts is always contained in the artefact that contains the first of the two connected artefacts, which is one of the constraints that allows Gmodel to fulfil the fourth criterion of the definition of model artefact – effectively enforcing much stronger rules regarding modularity than the minimum expectations set by KISS principles (4) and (9).

The most powerful feature of Gmodel instantiation is the ability to decorate any Gmodel artefact with instantiation semantics (or concretisation semantics) relating to representations of less abstract (or more concrete) sets, such that the artefact becomes instantiable. The instantiation semantics available in the Gmodel kernel are specified via the variables for cardinalities, navigability, and containment that are part of all edge ends of Gmodel edges. Thus, on the one hand, by excluding any circular links between artefacts, Gmodel imposes heavy constraints on the models that can be created; on the other hand, Gmodel allows an unlimited degree of freedom with respect to the number of instantiation levels.
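The contrast with a fixed layered architecture can be stated in one line (again our notation): writing a ⊳ a′ when a′ is created from a via the instantiation function, Gmodel admits instantiation chains

\[
a_0 \;\rhd\; a_1 \;\rhd\; a_2 \;\rhd\; \cdots \;\rhd\; a_n
\quad \text{for arbitrary } n,
\]

subject only to the acyclicity constraint on links, whereas the classical four-layered metamodel architecture fixes the chain at four levels (M3 down to M0).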
3.2 Surface notation
The name Gmodel is motivated by the graph concept, and all notations for visualising graphs are good candidates for a concrete syntax for Gmodel artefacts. In contrast, purely text-based representations are only practical for representing Gmodel artefacts with certain characteristics, such as artefacts with a low ratio of edges to vertices. Additionally, given that the main target audience for Gmodel consists of modellers in general – as opposed to software engineers with a strong preference for working with formal text-based languages – there is no urgent need for developing a human-readable, purely text-based syntax.

The Gmodel open source project has no intention of reinventing XML or burdening the world with yet another XML-based but not-quite-human-readable syntax. The Gmodel API can easily be used to build graphical editors for Gmodel artefacts that are complemented with appropriate form-based representations of variables and their values. At this point in time Gmodel provides two complementary graphical surface notations for visualising model artefacts.

The first graphical surface notation for artefacts is based on boxes and lines, and uses Unified Modeling Language style syntax elements for edges (like UML associations), visibilities (like UML dependencies), as well as generalisation references (like UML generalisations). All symbols that constitute the decoration of an artefact with instantiation semantics are coloured in red [figures 4 and 5]. The second graphical surface notation for artefacts is based on nested boxes that precisely mirror the set containment structure of an artefact. A generic graphical editor that allows artefacts to be created and modified is under development.

3.3 Model artefact storage
Gmodel internally uses a serialisation format that is not intended for human consumption, and it provides a binding of this serialisation format to relational database technologies. In particular, Gmodel fulfils criterion (3) of the definition of model artefacts, and provides explicit support for the semantics of unknown and the semantics of not applicable. As needed, the serialisation format can be bound to alternative persistence mechanisms such as file systems, object databases or cloud database technologies.

In order to work with model artefacts, Gmodel includes a repository API that currently offers basic artefact search functionality, which will be significantly enhanced in future releases.

3.4 Interoperability mechanisms
There are two main ways of achieving interoperability between Gmodel and other modelling technologies. This article focuses on the level of profound semantic interoperability with other metalanguages that can be achieved by making use of multi-level instantiation to emulate “foreign” technologies. Gmodel also offers an alternative for partial and superficial interoperability via file-based information exchange. Out of the box, Gmodel includes integration with the Eclipse integrated development environment and with the openArchitectureWare Xpand template/transformation engine, putting text or code generation at the user’s fingertips.

4. CONTRIBUTION TO MODEL-DRIVEN INTEROPERABILITY
To a significant degree the development of Gmodel was motivated by a lack of adequate technologies for formal modelling beyond the realm of software engineering and programming languages, and by a lack of interoperability between existing tools for domain-specific modelling. Gmodel simplifies model-driven interoperability in the following areas:

1. Modularity – The implementation of the artefact concept prevents users from constructing circular dependencies between modules. In contrast to other technologies, Gmodel allows the modelling of links between primary model artefacts and derived artefacts [figure 6], which amounts to an in-built infrastructure for orchestrating model transformations.

2. Simplicity – Since the Gmodel kernel treats links between concepts as first-class constructs, all kinds of graph structures (including undirected graphs) can be represented without compromise, and in the preferred terminology of the user. As a result, representations of modelling languages within Gmodel tend to be highly compact [figure 6], and the complexity of any required model transformations is reduced accordingly.

3. Multi-level modelling – Gmodel is not limited to the four-layered metamodel architecture. This opens up new approaches with respect to interoperability [5][2], since types – and therefore interoperability patterns – can be encoded in Gmodel to any level of complexity. As illustrated in this article, multi-level instantiation is a prerequisite for emulating “foreign” modelling technologies. We are not aware of any other multi-level modelling technology that is ready for industrial use.
4. Scope management – Gmodel has an explicit feature for scope management that is universally available within all modelling languages expressed in Gmodel [figure 9]. This gives designers of modelling languages and system architects an unprecedented amount of control over the artefacts that language users can instantiate. In the experience of the authors, such functionality is essential for managing the dependencies between languages and between components in large-scale software-intensive systems.

5. Separation of the concern of modelling from the concern of naming – Right down to the core, Gmodel functionality is expressed in semantic identities, and these identities can be referenced from as many representations (models) as needed. In practical terms this allows Gmodel to incorporate custom terminology and jargons at all meta levels.

6. Portability – In contrast to many other modelling technologies, Gmodel makes no assumptions about the implementation and legacy technologies that modellers are going to drive from their model artefacts. The Gmodel kernel is highly portable. It is articulated using the concepts presented in this paper, and makes use of the Java programming language to bootstrap the nine kernel concepts of ordered pair, ordered set, graph, vertex, edge, edge end, super set reference, visibility, and edge trace – but without exposing the Java type system in the core API, whilst restricting internal use of Java types to a handful: boolean, int, List, Iterator, UUID, and String.

5. EMULATING ECORE IN GMODEL
Gmodel clearly distinguishes between semantic domains and models. The former simply contain sets of semantic identities, whereas the latter contain representations of semantic identities from the view point of a particular actor.

5.1 Representing the Ecore metamodel
In Gmodel no model can be constructed without referencing elements in the relevant underlying semantic domains.

5.1.1 Defining the Ecore semantic domain
In Ecore the most generalised element is the EObject, and all other elements are part of a generalisation/specialisation hierarchy that starts with EObject. To represent Ecore in Gmodel, the first step consists of instantiating the semantic domain EcoreDomain, which contains all the semantic identities that appear in Ecore [figure 2]. This step will be perceived as somewhat unusual by all those who are only familiar with the definition of text-based languages using EBNF-style grammars, since in such specifications the concern of representation and the concern of naming are one and the same.

The number of semantic identities required to represent Ecore is significantly larger than the number of elements that appear in the Ecore generalisation/specialisation hierarchy. Every instance of an EDataType, every instance of an EReference, every instance of an EAttribute, etc. that occurs in the encoding of Ecore in itself requires a corresponding semantic identity. Loosely speaking, everything that has a name in the encoding of Ecore in itself corresponds to a semantic identity.

5.1.2 Representing the representation of Ecore
To prepare for the representation of Ecore in itself (the metametamodel level in the classical four-layered metamodel architecture) in Gmodel, we instantiate a model artefact (with meta element vertex) based on the semantic identity Ecore that has been defined as part of the EcoreDomain in the previous step. Loosely speaking, we now have an empty model artefact called Ecore.

We can then proceed to add contained artefacts to the Ecore artefact that correspond to the Ecore generalisation/specialisation hierarchy that starts with EObject. Once this is done, we can represent the entire Ecore generalisation/specialisation hierarchy in the Ecore artefact using super set references, as shown in figure 3, and we can represent all instances of EReferences in the Ecore artefact within Gmodel, as illustrated in figure 4.

Lastly, we add all relevant variables to the elements of the Ecore artefact, making use of appropriate semantic identities from the EcoreDomain.

The whole process of representing Ecore in Gmodel is straightforward modelling in Gmodel, and requires no coding in a programming language.

5.2 Representing Ecore models
The representation of Ecore models (the metamodel level in the classical four-layered metamodel architecture) in Gmodel follows the same pattern as the representation of Ecore in itself in Gmodel. First, appropriate semantic identities must be defined, and then the Ecore model artefact can be instantiated to obtain an empty model artefact. Note that above we instantiated a vertex to obtain a model artefact with the Ecore semantic identity, and now we are instantiating this model artefact.

Just as above, the next step consists of adding contained artefacts to the model artefact; this time, however, the meta elements of the contained artefacts correspond to Ecore concepts. Up to this point there is nothing special about using Gmodel. We could turn the tables and proceed with very similar steps in Ecore to obtain a reasonable representation of Gmodel – “reasonable”, because Ecore actually lacks one instantiation level to provide a precise representation of Gmodel edges. But instead of delving into the encoding details of Gmodel edges, the following step in encoding Ecore models is straightforward to follow, and clearly illustrates where multi-level instantiation plays a critical role.
In Gmodel we can proceed to represent all instances of EReferences as demanded by the Ecore model we are emulating, and we can use the edges that Gmodel uses to represent EReference instances to record the cardinalities pertaining to the instantiability of the model artefact. In a metalanguage without multi-level instantiation we would already have hit rock bottom at this point. We would have been able to express links between elements (which, depending on the metalanguage, may be called “references”, “associations”, “relationships”, “connections”, “edges” or similar – the name is immaterial), but we would not have been able to decorate these links with cardinalities etc., which constitute essential instantiation semantics for the next level of instantiation or concretisation.

Figure 2: The Ecore semantic domain
Figure 3: Encoding of Ecore super types
Figure 4: Encoding of Ecore references
Figure 5: Encoding of an entity relationship modelling language in the Ecore emulation
Figure 6: Native encoding of an entity relationship modelling language in Gmodel
5.3 Representing instances of Ecore models
Given the explanations above, it is obvious how to proceed to instantiate Ecore models (the model level in the classical four-layered metamodel architecture) in Gmodel, such as the example shown in figure 5. The comparison with figure 6 illustrates the complexity introduced on the one hand by the emulation, and on the other hand by the Ecore encoding of links between concepts in the form of EReferences.

5.4 Representing instances of instances of Ecore models
In Gmodel there is no reason to stop modelling at the “model” level. If the modeller has invested in decorating a model artefact with instantiation semantics, Gmodel is capable of applying these semantics – regardless of the level of instantiation or concretisation [figure 7].

In practical terms, multi-level instantiation allows the modeller to instantiate operational data right down to the concrete level (the instance level in the classical four-layered metamodel architecture) – where Joe Bloggs owns life insurance policy number 123456 [figure 8].

Given that industrial-strength relational database technology is the default storage format used by Gmodel, navigating and maintaining large databases or data warehouses is simply a matter of using the Gmodel repository for navigation, and of using Gmodel’s instantiation function.

5.5 Interoperability between Ecore and Gmodel
With the native encoding of Ecore emulated by Gmodel artefacts, building a bi-directional bridge between the two technologies has become a trivial task. The Ecore API can be used to systematically read EMF models (at the metamodel level and the model level in the classical four-layered metamodel architecture), and the retrieved in-memory representations can be mechanically mapped to corresponding in-memory representations in the Ecore emulation within Gmodel.

Gmodel is a technology that allows the construction of model-driven systems on a new scale, whereas EMF Ecore is a technology with an established user base and a vast array of useful transformation and generator components that facilitate the binding to popular Java implementation technologies. A bridge between Ecore and Gmodel can be driven by an event-based mechanism to create dynamic interoperability between the two technologies, opening up interesting avenues for model-driven systems that exploit the strengths of both technologies.

6. ADVANCED MODELLING TECHNIQUES
Modularity and scope management go hand in hand. One without the other is of very little value.

6.1 Scope management via visibilities
Gmodel requires users to be explicit about scope. A model artefact may not reference any element in other model artefacts unless these artefacts have been declared to be visible from the first artefact [figure 9]. In contrast to most programming languages, declarations of visibility are not part of an artefact; they are part of the parent artefact in the so-called artefact containment tree.

The parent artefact has the responsibility of providing the architectural context for all the artefacts that it contains. The authors of Gmodel consider it good modelling practice to associate every artefact with a producer, and to identify and name the binding time that is associated with the instantiation of an artefact. Experience from many large-scale software system development initiatives has consistently confirmed the usefulness of this approach to system analysis and modularisation.
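The visibility rule can be condensed into a single constraint (our notation): for artefacts a and b contained in a parent artefact p,

\[
\mathit{ref}(a, b) \;\Rightarrow\; (a, b) \in \mathit{Vis}(p)
\]

that is, an element of a may reference an element of b only if p declares a visibility from a to b in the artefact containment tree.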
The encoding of Ecore in Gmodel required the declaration of only a small number of visibilities, but there are much better practical examples that can be used to demonstrate the value of scope management via visibilities – a topic that goes beyond the scope of this paper.

Visibilities offer significant value to intensive users of EMF, as Ecore lacks a corresponding facility. By switching from the native implementation of Ecore to the Gmodel Ecore emulation, EMF users gain access to visibilities, and hence obtain a powerful tool for actively managing and restricting the dependencies in large-scale Java component architectures.

6.2 Applications of multi-level instantiation

6.2.1 The bottomless pit of abstractions
Gmodel incorporates the insight from experienced modellers that there is no absolute rock-bottom concrete level of models. Life insurance policy number 123456 only looks like an instance from the view point of the average policy holder. From the view point of the insurer, a specific version of the policy that is active for a certain interval is a more appropriate perception of an instance. If, in 2020, Joe Bloggs decides to shift his entire life into n virtual worlds (given the track record of software technology, who would want to put all eggs in one basket), his view point will shift. Life insurance policy number 123456 in Second Life may be considered to be one instance, and the corresponding policy representation in Third Life may be considered to be a different instance – perhaps the currency in which premiums are being paid is different in each of the virtual worlds.

Figure 7: Snippet from an entity relationship model of a CRM application
Figure 8: Joe Bloggs’ life insurance policy number 123456
Figure 9: Example of visibility declarations

6.2.2 Value chain modelling and mass customisation
If the above sounds far-fetched, analysing the typical evolution of technology products over a period of several years provides further motivation for multi-level instantiation. Since the 1970s software has been used as a tool not only to automate industrial production, but also to extend the degree to which technology products can be configured and customised without having to resort to manual manufacturing techniques. Mass customisation has become commonplace in many industries.

The evolution of a product over longer stretches of time can be modelled as a series of instantiation levels. Adding a new set of configuration options equates to adding additional variables to an artefact that used to be perceived as an instance. What used to be called a product morphs into a product line, and the new products are the instances of the product line, in which each of the variables takes on a concrete value. The view point of the customer usually remains unaffected; she still buys instances of a product.

Within a non-trivial value chain, the variables associated with a product line tend to be replaced by concrete values in a series of stages, so-called binding times. Each binding time is associated with a specific actor that is responsible for making decisions regarding the values of a specific set of variables. In our experience, multi-level instantiation is by far the simplest modelling technique for representing non-trivial value chains.
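One way to make this precise (our notation, not the authors’): let V(a) be the set of variables of a product line artefact a, and let the binding times t_1 < … < t_m partition V(a) into disjoint subsets, each owned by one actor:

\[
V(a) = V_1 \,\dot\cup\, V_2 \,\dot\cup\, \cdots \,\dot\cup\, V_m
\]

The artefact available after stage j is the partial instantiation in which exactly the variables in V_1 ∪ … ∪ V_j have been bound to concrete values; the product a customer buys is the stage-m result, and every intermediate stage is itself a model artefact at its own instantiation level.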
The alternative of using a purely object-oriented design, in combination with the classical power type pattern, leads to system designs that are much more complex and much less maintainable than they could be. In particular, the traditional distinction between design-time and run-time is a dangerous over-simplification that distracts from the need for proper value chain analysis (also known as domain analysis in the discipline of software product line engineering).

6.3 Applications of denotational semantics
Since all model artefacts in Gmodel are constructed from semantic identities, and since semantic identities are the only Gmodel elements that have names, semantic identities offer a one-stop shop for dealing with all aspects of naming. This greatly facilitates any required translation between different terminologies, and it even enables users to replace the names of the semantic identities in the Gmodel kernel. If a user prefers to call a vertex a node, or if she prefers to rename TRUE to FALSE and FALSE to TRUE, so be it. The role of modelling is representation and not naming.

It is worthwhile to note that semantic identities are not only applicable at the atomic level to define identities such as TRUE and FALSE, but are just as applicable to statements such as minimum cardinality = 1, or to aggregates such as the entire Ecore model artefact.

Separating the concern of modelling from the concern of naming adds value precisely because good terminology is so important. Each pair of collaborating actors in a value chain tends to have a preferred terminology or jargon for their specific interactions, and such jargon is often a valuable tool for disambiguation.

Without the systematic use of semantic identities, establishing interoperability across an entire value chain is significantly complicated. Names end up being used in the definition of protocols and artefacts, and the reliability of links between the participants in the value chain, and of communication across those links, suffers accordingly.

7. OTHER TECHNOLOGIES
The level of interoperability between current domain-specific modelling tools is comparable to the level of interoperability between CASE tools in the 90s. To increase the popularity of model-based approaches, this needs to change. The assumption that all parties in a global software supply chain will use identical tooling is simply not realistic.

7.1 Research prototypes
We are aware of at least three research prototypes with some form of multi-level instantiation capability [1], [8], [9], [6]. It would be extremely interesting to compare the design of Gmodel with the design of these technologies.

7.2 Eclipse Modeling Framework Ecore
In this paper we have illustrated how Gmodel can be used to emulate the Ecore technology, and conversely we have highlighted some of the limits of Ecore, in particular the lack of support for multi-level instantiation.

7.3 MetaEdit+
MetaEdit+ is a mature metamodelling and modelling tool that compares favourably with the Eclipse Modeling Framework. In particular, the metametamodel of MetaEdit+ is simpler than the metametamodel used by Ecore, without any sacrifice in expressive power. But just like Ecore, MetaEdit+ follows the four-layered metamodel architecture dogma and does not offer multi-level instantiation. As a result, MetaEdit+ runs into the same limitation as Ecore when attempting to emulate “foreign” modelling technologies.

Similar to Gmodel, MetaEdit+ relies on database technology rather than a file system for the storage of model artefacts, enabling modellers to build large-scale model-driven systems, but without explicit scope management facilities.

7.4 Unified Modelling Language tools
The main target audience of UML consists of software professionals who have an interest in visualising code, especially object-oriented code. Most UML tools offer only very limited – if any – functionality for instantiating models that users have created. Since UML is based on the Meta Object Facility (and on Ecore or similar implementations), UML tools are affected by the kinds of limitations discussed in this paper in relation to Ecore.
7.5 Programming languages
There are several programming languages that offer multi-level instantiation, and there are also a number of programming languages that are based on denotational semantics, such as LISP or REBOL. Whilst these languages have expressive power that is comparable to Gmodel, they do not offer the limitations and constraints that have consciously been built into Gmodel.

Programming language designers approach language design from a view point that differs significantly from the view point of a modelling language designer.

1. A programming language is designed to be executable on a specific platform. The platform represents the solution space, and the implementations of programming languages are optimised with respect to using the resources offered by the platform.

2. Since most programming languages are general-purpose languages, they have to offer features that cover the needs of a big range of different users. As a result, programming languages offer many features that are not strictly needed by the majority of users. These features lead to additional degrees of freedom in solution designs, and consequently lead to variations in implementation that are induced by the personal design preferences of individual software engineers. In the small this may not matter, but in the large these variations are known as spurious complexity.

3. A modelling language is designed for the representation of specific kinds of problems. As outlined in this article, problem spaces are best modularised along the lines of the actors that participate in a value chain, and each actor must be equipped with modelling languages that have a clear focus on the specific context and view point – all other details must be abstracted away. The result is a design force that pulls in the opposite direction of the design force that drives the development of most programming languages. The most valuable modelling languages are not only domain-specific, they are company-specific.

Gmodel is a metalanguage that strives to provide expressive power in those areas that matter to modellers, and at the same time it strives to restrict those expressive powers that may lead to non-maintainable artefacts.

8. CONCLUSIONS
Although Gmodel is a brand new metalanguage, it embodies the collective lessons of many experienced modellers. The specific constraints that have been built into Gmodel have a track record of many years in industrial practice. Up to now, best practices for scope management and modularity had to be applied manually, in the form of conventions. This worked up to a point, but it posed limits to the scalability of modelling technology in large environments.

Without appropriate tool support, designing and maintaining advanced model-driven systems requires a large number of highly skilled modellers and system architects, and often the required level of expertise is simply not available.

We hope that Gmodel offers the missing stepping stone that allows a much larger group of organisations to reap the benefits of formal modelling, by significantly reducing the number of concepts and technologies that a designer of modelling languages needs to be familiar with, and by offering features – such as multi-level instantiation – that lead to simpler and clearer designs. Interoperability with EMF Ecore as outlined in this article is currently being refined, and a bi-directional bridge between Gmodel and Ecore will be a feature in an upcoming release of Gmodel.

No modelling tool can ever replace the need for domain analysis, but Gmodel is ideally positioned to record the results of domain analysis. On the one hand Gmodel provides domain-specific modelling support for all participants in a value chain, and on the other hand it serves as a front end for model transformation and code generation technologies that allow models to be glued to existing technologies and legacy systems.

9. REFERENCES
[1] Colin Atkinson, Matthias Gutheil, and Bastian Kennel. A flexible infrastructure for multilevel language engineering. IEEE Trans. Softw. Eng., 35(6):742–755, 2009.
[2] Jorn Bettin and Tony Clark. Gmodel, a language for modular meta modelling. In Australian Software Engineering Conference, KISS Workshop, 2009.
[3] Jorn Bettin and Tony Clark. The Knowledge Industry Survival Strategy (KISS) initiative, 2009.
[4] Jorn Bettin, William Cook, Tony Clark, and Steven Kelly. Knowledge industry survival strategy (KISS): fundamental principles and interoperability requirements for domain specific modeling languages. In OOPSLA ’09: Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications, pages 709–710, New York, NY, USA, 2009. ACM.
[5] Tony Clark, Paul Sammut, and James Willans. Applied metamodelling: A foundation for language driven development, 2008.
[6] Tony Clark, Paul Sammut, and James Willans. Superlanguages: developing languages and applications with XMF, 2008.
[7] Wilfrid Hodges. A shorter model theory. Cambridge University Press, New York, NY, USA, 1997.
[8] A. Laarman. An ontology-based metalanguage with explicit instantiation, March 2009.
[9] Alfons Laarman and Ivan Kurtev. Ontological metamodeling with explicit instantiation. In M. van den Brand, D. Gašević, and J. Gray, editors, Software Language Engineering, volume 5969 of Lecture Notes in Computer Science, pages 174–183, Heidelberg, January 2010. Springer Verlag.
[10] M. Tomasello, M. Carpenter, J. Call, T. Behne, and H. Moll. Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences, 28:675–691, 2005.
[11] David A. Schmidt. Denotational semantics: a methodology for language development. William C. Brown Publishers, Dubuque, IA, USA, 1986.
Model-driven Rule-based Mediation in XML Data Exchange
Yongxin Liao (1), Dumitru Roman (2), and Arne J. Berre (2)
SINTEF ICT
Forskningsveien 1, Oslo, Norway
(1) yongxinliao@gmail.com, (2) {firstname.lastname}@sintef.no
ABSTRACT
XML data exchange has become ubiquitous in Business to Business (B2B) collaborations. Automating the exchange of XML data between enterprise systems as much as possible is a key requirement for ensuring agile interoperability and scalability in B2B collaborations. The lack of standardized XML canonical models or schemas in B2B data exchange, as well as semantic differences and inconsistencies between the conceptual models of the parties that want to exchange XML data, implies that XML data cannot be exchanged directly and fully automatically between B2B systems. We are left with the option of providing techniques and tools to support humans in reconciling the differences and inconsistencies between the data models of the parties involved in a data exchange. In this paper we introduce such a technique and tool for XML data exchange. Our approach is based on a mechanism for lifting XML schemas and instances to an object-oriented model, and on the design and execution of data mediation at the object-oriented level. We use F-logic – an object-oriented rule language – together with its Flora2 engine as the underlying mechanism for providing an abstract, object-oriented model of XML schemas and instances, as well as for the specification and execution of the mappings at the model level. This provides us with a fully-fledged tool for design- and run-time data mediation that focuses on the actual semantic models behind the XML schemas, rather than having to deal with the technicalities of XML in the data mediation process. Finally, we present the architecture of the current data exchange system and report on a preliminary evaluation of our system.

Categories and Subject Descriptors
D.2.12 [Software Engineering]: Interoperability; D.2.2 [Design Tools and Techniques]; H.2.5 [Heterogeneous Database]

General Terms
Algorithms, Design, Experimentation, Languages

Keywords
XML Data Exchange, Data mediation, Semantic mapping

1. INTRODUCTION
Providing techniques and tools to improve the level of automation of XML data exchange in B2B collaborations is widely regarded as a key enabler for agile interoperability and scalability in B2B collaborations [1]. In this paper we introduce a technique and tool for design- and run-time support of XML data exchange. Before we give a brief overview of the approach, let us define in more detail the problem of XML data exchange in the context of B2B collaborations.

Since we assume the data sent and received by parties in a B2B collaboration to be in XML, we face the problem of XML data transformation. Figure 1 provides an overview of the elements involved in XML data transformation and the process by which an XML document is transformed into another document. Company X (depicted on the left side of the picture) wants to send the Source XML document (e.g. an invoice) to Company Y. The Source XML document is compliant with an XSD schema (Source XSD) made available by Company X such that the receivers of its XML documents can understand the structure and meaning of such documents. Company Y (on the right side of the figure) processes XML documents (in our case Target XML) according to its own schema, Target XSD. If Target XSD differs from Source XSD, then Company Y is faced with the problem of having to process the Source XML document, which it does not understand.

Figure 1. Generic design-time and run-time XML data transformation. (A Transformation Layer sits between Company X and Company Y: at design time the Source XSD is mapped to the Target XSD via a schema transformation; at run time the Source XML is transformed into the Target XML via an instances transformation.)
Therefore, the core challenge is to generate the Target XML document from the Source XML document, given the Source XSD schema and the Target XSD schema. A Transformation Layer is usually designed to address this challenge by providing means to map the Source XSD to the Target XSD at design time, and by providing an engine that implements the schema mappings at run time, when the Target XML needs to be generated from the Source XML.

Since the transformation cannot be fully automated, the core question is how to design the transformation layer in such a way that the human intervention in the specification and execution of mappings is kept at a minimum.

XSD is well known to be a complex language, and designing mappings between XSD schemas is nothing but a challenge. It is our belief that the mapping designer should focus on the mappings at the semantic level between the conceptual models behind the XSD schemas that need to be mapped, rather than having to deal with the technicalities of XSD. Therefore, in this paper we rely on the lifting of XSD schemas to more abstract, object-oriented models, and on the specification of the mappings at this more abstract layer. This not only eases the specification of the mappings by the mappings creator, but also enables other kinds of schemas, not only XSDs, to be mapped to or from XSD schemas. In this paper we chose F-logic – a rule-based object-oriented logical language – as the language to represent the semantic models behind the XSD schemas. We use F-logic not only for specifying the semantic models, but also for specifying the mappings between them. Furthermore, the use of the Flora2 engine (http://flora.sourceforge.net/) – a reasoning engine for F-logic [3] – allows us to perform run-time mediation. In this way, we use F-logic/Flora2 as a platform-independent model according to the OMG MDA architecture. We argue for two benefits of our approach to XML data exchange:

1. It allows the mappings creator to focus on the semantic, object-oriented model behind the XSD schemas and to specify the mappings at a more abstract, semantic level, rather than having to deal with the technicalities of XSD schemas.

2. It allows both specification and execution of data mappings (i.e. design- and run-time mapping) in a single, unifying framework.

The remainder of this paper is organized as follows. Section 2 provides a brief introduction to F-logic/Flora2. Section 3 presents our mapping approach for the lifting of XSD schemas to object-oriented models, mapping specification, and run-time execution. Section 4 provides an overview of the architecture of our data exchange system, together with some preliminary performance results. Section 5 concludes this paper, together with some relevant related work and potential extensions.

2. BRIEF OVERVIEW OF FLORA2
In order to realize data mediation at a more abstract, semantic level, we need a higher level of abstraction for the representation of XML schemas and instances. Our approach is based on using object-oriented representations to abstract XML schemas and instances, and then performing the mapping between a source and a target at the object-oriented level. It is easier to focus on the semantics of data if it is represented in an object-oriented form rather than in a tree-like structure as in XSD. With our solution, mapping rules, schemas and instances are all in object-oriented form.

In this paper, F-logic, together with its Flora2 implementation, is used as the object-oriented language for formalizing schemas and instances, as well as the mappings. Flora2 is a sophisticated object-oriented knowledge base language and application development environment [3]. Flora2 is implemented as a set of run-time libraries and a compiler that translates a unified language of F-logic, HiLog and Transaction Logic into tabled Prolog code. Figure 2 presents examples of Flora2 schema and object descriptions, rules and queries, as well as of loading files into modules. In the specification of schemas, ‘=>’ is used to specify the types of the attributes of a class, and ‘*’ marks inheritable attributes; in the specification of objects, ‘->’ is used to specify the values of an object’s attributes; ‘>>Mod’ loads a program into a module Mod, and ‘@Mod’ queries values in module Mod. (In person[name*=>string, children*=>person], person is a class, name and children are attributes, and string is a type; John:person states that object John is an instance of class person.) The reader is referred to [3] for further details of the syntax and semantics of F-logic/Flora2.

Schema description:
person[name*=>string, children*=>person].

Object descriptions:
John:person[name -> ‘John Doe’, children -> {Bob, Mary}]
Mary:person[name -> ’Mary Doe’, children -> {Alice}]

Rules:
?X:human :- ?X:person.

Queries (whose child is Bob in module Mod):
?X:person@Mod, ?X[name ->?Y, children->Bob]@Mod.

Output result:
?X=’John’, ?Y=’John Doe’

Loading programs into modules:
?- [‘path/filename.flr’>>Mod]
#include “path/filename.flr”

Figure 2. Flora2 examples: objects, rules, queries.

The core motivation for choosing Flora2 is that it is a rule-based object-oriented logical language which provides support for the flexible specification of schemas, instances and mapping rules, and at the same time it can be used to execute mapping rules on instance data. Flora2 comes with an XML package which supports loading and parsing XSD/XML documents, converting them to sets of Flora2 objects stored in user-specified Flora2 modules. It also provides equivalent entities for XSD and XML – features that are used in our framework for data mediation.

3. MAPPING APPROACH
Our proposed solution, called FloraMap, is based on logical rules for specifying mappings at the schema level and executing those mappings at the instance level. The choice of logical rules is motivated by their declarative and procedural semantics, making them a powerful tool for declaratively specifying and at
Design-time:
the same time executing mappings. Logical rules cannot work
directly with XSDs, and therefore proper abstraction mechanisms
need to be developed for abstracting XSD schemas, on top of
which mappings can be designed and executed. Our choice for
such abstractions is the use of object-oriented techniques for
representing XSD and XML, on top of which mapping rules can
be more easily specified.
Flora2
Schema
Design-time
Run-time
Target
XSD
Transform Engine
2.
Logical rules are used to specify the mappings between the
source Flora2 schemas and target Flora2 schemas.
3.
The Source XML is represented as Flora2 objects of the
source Flora2 schema
4.
Logical rules from step 2 are executed for the source Flora2
objects and target Flora2 objects are generated
5.
The target Flora2 objects are serialized in target XML
instances
The rest of this section will give an overview of how abstraction
is achieved (mapping XML schemas and instance to Flora2
representations), how mappings are specified and executed (i.e.
mapping Flora2 source objects to Flora2 target objects), and how
the resulting Flora2 objects are serialized in XML (i.e. mapping
Flora2 objects to XML instances).
Flora2
Schema
Semantic Mapping
(Specification and Execution)
Flora2
Objects
The Source XSD and Target XSD are represented as source
and target Flora2 object-oriented schemas.
Run-time:
Figure 3 below gives an overview of the mapping approach. We
can separate the mapping in two parts: Design-time and Run-time.
Source
XSD
1.
To exemplify these steps we will use the exchange of an XML
invoice between a company X (source) and a company Y (target).
The schemas of the invoices of companies X and Y are presented
in Figure 4, together with the following mappings:
Flora2
Objects
1.
Source
XML
Target
XML
Bizszam in source is the same as InvoiceNumber in target
2.
Bizkelt in source is the same as InvoiceDate in target
3.
City in source is the same as DeliveryAddress.city in target
4.
Zip in source is the same as DeliveryAddress.zip in target
5.
Street in source is the same as DeliveryAddress.street in
target
6.
AccDate in target is a concatenation of Ev in the source, a
delimiter, Kanyvho in the source, a delimiter, and the
string ’01’, i.e. AccDate = (Ev+‘_’+Kanyvho+‘_’+’01’)
Figure 3. Mapping Approach – Overview.
(a) Source XSD: Company X

<xs:element name="InvoiceCompanyX">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Bizszam" type="xs:string"/>
      <xs:element name="Ev" type="xs:string"/>
      <xs:element name="Kanyvho" type="xs:string"/>
      <xs:element name="Bizkelt" type="xs:string"/>
      <xs:element name="city" type="xs:string" minOccurs="0"/>
      <xs:element name="zip" type="xs:int" minOccurs="0"/>
      <xs:element name="street" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

(b) Target XSD: Company Y

<xs:element name="InvoiceCompanyY">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="InvoiceNumber" type="xs:string"/>
      <xs:element name="AccDate" type="xs:string"/>
      <xs:element name="InvoiceDate" type="xs:string"/>
      <xs:element name="DeliveryAddress" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="city" type="xs:string" minOccurs="0"/>
            <xs:element name="zip" type="xs:string" minOccurs="0"/>
            <xs:element name="DoorNo" type="xs:string" minOccurs="0"/>
            <xs:element name="street" type="xs:string" minOccurs="0"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Figure 4. XML Schemas and mappings example. (The numbered arrows 1-6 in the figure correspond to the mappings listed above.)
3.1 XSD2OO
The technique we designed for abstracting XML schemas to object-oriented models generates two Flora2 models for each XSD: one Flora2 model (Abstract) contains the "clean" conceptual model of the schema (without any technicalities of XSD, but focusing on the semantics of the elements), and the other one (Special) contains XSD-specific information (sequence, choice, etc.) which will be used for generating the structure of target XML instances.

In most cases, XSD elements find a natural representation in Flora2. For example, if a job element in XSD is specified as <element name="job" type="string" minOccurs="0" maxOccurs="5"/>, it can be transformed in Flora2 into [job {0:5}*=>string]. The {0:5} cardinality is equivalent to minOccurs="0" and maxOccurs="5" in XSD.

Due to length restrictions, we do not provide the reader with a complete mapping of XSD to Flora2 schemas. Nevertheless, Table 1 provides three examples of how top-level elements in XSD are mapped to Flora2 representations. Note that attributes and elements are different things in XSD, but we abstract them as the same in the Abstract model and record the difference in the Special model.

Table 1. Example of XSD elements to Flora2 schema mapping

Situation 1: Top-level Element with BaseType
XSD:
<element name="name" type="string" maxOccurs="2"/>
Abstract:
name[name {1:2} *=>string].
Special:
none

Situation 2: Top-level Element with ComplexType
XSD:
<element name="name">
  <complexType>
    <sequence>
      <element name="firstname" type="string"/>
      <element name="lastname" type="string"/>
    </sequence>
  </complexType>
</element>
Abstract:
name[firstname {1:1} *=>'string'].
name[lastname {1:1} *=>'string'].
Special:
Elements[name->firstname].
Elements[name->lastname].
Sequences[name->[firstname,lastname]].

Situation 3: Top-level Element with SimpleType
XSD:
<element name="age">
  <simpleType>
    <restriction base="int">
      <maxInclusive value="200"/>
    </restriction>
  </simpleType>
</element>
Abstract:
age[base *=>'int'].
age[maxInclusive->200].
Special:
none

XSD import and include have natural equivalents in Flora2 modules. For example, "filename.xsd" is included in an XSD file as <include schemaLocation="filename.xsd"/>; this can be transformed into #include "filename_Abstract.flr" in the Flora2 Abstract file and #include "filename_Special.flr" in the Flora2 Special file. For XSD import, the following steps can be used for the mapping:
1. ['filename_Abstract.flr'>>namespace] in the Flora2 Abstract file
2. ['filename_Special.flr'>>namespace] in the Flora2 Special file
3. Keep the element name and replace the ":" with "_" in the type

Table 2 below exemplifies the way XSD import and include are handled in Flora2 schemas.

Table 2. XSD import and include to Flora2 mapping

Situation 1: XSD Import
XSD:
<schema xmlns:ccts="abcd">
  <import namespace="abcd" schemaLocation="../Information.xsd"/>
  <element name="person">
    <complexType>
      <sequence>
        <element name="name" type="ccts:Type"/>
        <element ref="ccts:age"/>
        <element name="work">
          <complexType>
            <simpleContent>
              <extension base="ccts:workType"/>
            </simpleContent>
          </complexType>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
Abstract:
?- ['path/Information_Abstract.flr'>>ccts].
person[name {1:1} *=> ccts_nameType].
person['ccts:age' {1:1} *=> ccts_age].
person[work {1:1} *=> personwork].
personwork['ccts:workType' {1:1} *=> ccts_workType].
Special:
?- ['path/Information_Special.flr'>>ccts].
Elements[person -> name].
Elements[person -> 'ccts:age'].
Elements[person -> work].

Situation 2: XSD Include
XSD:
<include schemaLocation="person.xsd"/>
Abstract:
#include "path/person_Abstract.flr"
Special:
#include "path/person_Special.flr"

The result of applying the XSD to Flora2 transformation to the XSD schema of Company X (Figure 4.a) is depicted in Figure 5, and the result of applying the transformation to the XSD schema of Company Y (Figure 4.b) is depicted in Figure 6.
Flora2 Abstract (Company X):
Namespace[value->'xs:'].
InvoiceCompanyX[Bizszam{1:1}*=>'xs:string'].
InvoiceCompanyX[Ev{1:1}*=>'xs:string'].
InvoiceCompanyX[Kanyvho{1:1}*=>'xs:string'].
InvoiceCompanyX[Bizkelt{1:1}*=>'xs:string'].
InvoiceCompanyX[city{0:*}*=>'xs:string'].
InvoiceCompanyX[zip{0:*}*=>'xs:int'].
InvoiceCompanyX[street{0:*}*=>'xs:string'].

Flora2 Special (Company X):
Sequences[InvoiceCompanyX ->['Bizszam','Ev','Kanyvho','Bizkelt','city','zip','street']].
Elements[InvoiceCompanyX ->Bizszam].
Elements[InvoiceCompanyX ->Ev].
Elements[InvoiceCompanyX ->Kanyvho].
Elements[InvoiceCompanyX ->Bizkelt].
Elements[InvoiceCompanyX ->city].
Elements[InvoiceCompanyX ->zip].
Elements[InvoiceCompanyX ->street].

Figure 5. Flora2 schema representation of the Company X XSD schema (Figure 4.a)

Flora2 Abstract (Company Y):
Namespace[value->'xs:'].
InvoiceCompanyY[InvoiceNumber{1:1}*=>'xs:string'].
InvoiceCompanyY[AccDate{1:1}*=>'xs:string'].
InvoiceCompanyY[InvoiceDate{1:1}*=>'xs:string'].
InvoiceCompanyY[DeliveryAddress{0:*}*=>CompanyYDeliveryAddress].
CompanyYDeliveryAddress[city{0:*}*=>'xs:string'].
CompanyYDeliveryAddress[zip{0:*}*=>'xs:string'].
CompanyYDeliveryAddress[DoorNo{0:*}*=>'xs:string'].
CompanyYDeliveryAddress[street{0:*}*=>'xs:string'].

Flora2 Special (Company Y):
Sequences[InvoiceCompanyY ->['InvoiceNumber','AccDate','InvoiceDate','DeliveryAddress',TheOrderEnd]].
Elements[InvoiceCompanyY ->InvoiceNumber].
Elements[InvoiceCompanyY ->AccDate].
Elements[InvoiceCompanyY ->InvoiceDate].
Elements[InvoiceCompanyY ->DeliveryAddress].
Sequences[CompanyYDeliveryAddress->['city','zip','DoorNo','street']].
Elements[CompanyYDeliveryAddress ->city].
Elements[CompanyYDeliveryAddress ->zip].
Elements[CompanyYDeliveryAddress ->DoorNo].
Elements[CompanyYDeliveryAddress ->street].

Figure 6. Flora2 schema representation of the Company Y XSD schema (Figure 4.b)

These Flora2 Abstract and Special parts represent the source and target XSDs and will be used as input in the design-time mapping and in the run-time target XML instance generation.

3.2 XML2OO
The technique we designed for abstracting XML instances to object-oriented models generates one Flora2 model. Flora2 provides natural equivalences between object entities and XML instances. For example, if an instance of a job element is represented in XML as <job>Programmer</job>, it can be transformed into obj_1:person[job->'Programmer'] in Flora2. obj_1 is a unique object name, and obj_1:person means that obj_1 is one of the instances of person. To transform an XML instance into Flora2 objects, the following high-level steps are devised:
1. Parse the XML instance files in Flora2, resulting in a Flora2 tree.
2. Load the Flora2 Abstract source files in Flora2.
3. Generate the Flora2 object structure according to the Flora2 Abstract and query the values from the Flora2 tree. Object names are constructed by concatenating "obj_" and a unique number (e.g. 1_1_2) generated from the unique location in the tree (a small naming sketch is given after Figure 7).

Step 1 is performed by the Flora2 engine itself and is not part of our implementation (the Flora2 XML package provides XML parsing support). It stores XML instances in a Flora2 tree automatically when XML files are parsed. FloraMap uses this package to load XML files and uses the Flora2 tree to query the values.

Steps 2 and 3 are performed by FloraMap. FloraMap generates the Flora2 object structure according to the Flora2 Abstract and queries the values from the Flora2 tree.

Figure 7 shows the generation of a Flora2 object from an XML instance example of Company X. In the upper part are X's XML instance and the Flora2 Abstract; the output is the Flora2 object obj.

Source XML (Company X):
<InvoiceCompanyX>
  <Bizszam>I_001</Bizszam>
  <Ev>2010</Ev>
  <Kanyvho>05</Kanyvho>
  <Bizkelt>2010-05-18</Bizkelt>
  <city>Oslo</city>
  <zip>1234</zip>
  <street>First Street</street>
</InvoiceCompanyX>

Flora2 object (Company X):
obj: InvoiceCompanyX['Bizszam'->'I_001'].
obj: InvoiceCompanyX['Ev'->'2010'].
obj: InvoiceCompanyX['Kanyvho'->'05'].
obj: InvoiceCompanyX['Bizkelt'->'2010-05-18'].
obj: InvoiceCompanyX['city'->'Oslo'].
obj: InvoiceCompanyX['zip'->'1234'].
obj: InvoiceCompanyX['street'->'First Street'].

Figure 7. XML to Flora2: Company X
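To illustrate the naming scheme mentioned in step 3, the following is a small hedged sketch (the nested structure is borrowed from the Company Y schema purely for illustration; the exact identifiers FloraMap generates may differ):

// Sketch of the naming convention only: "obj_" plus a unique
// number derived from the node's location in the Flora2 tree.
obj_1 : InvoiceCompanyY[InvoiceNumber -> 'I_001'].
obj_1 : InvoiceCompanyY[DeliveryAddress -> obj_1_4].
obj_1_4 : CompanyYDeliveryAddress[city -> 'Oslo'].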
3.3 OO2OO
The core part of data mediation is the specification and execution of the mappings in Flora2, a process which takes as input the Flora2 Abstract schemas of the source and target, the mappings between them, and the Flora2 source objects, and generates Flora2 target objects according to the specification of the mappings. This phase can be separated into three steps:
1. Specification of the design-time mappings between the source and target Flora2 Abstract schemas.
2. Generation of the executable (run-time) mappings from the design-time specification of the mappings.
3. Execution of the mappings on the source Flora2 objects for the generation of the Flora2 target objects.
For step 1 we provide a simple mechanism to capture the correspondences between the Flora2 Abstract source and target schemas. This is achieved by the following Flora2 predicates:

OneToOne([source],[target]).
OneToMany([source],[[target1],[target2],…],[n1,m1,n2,m2,…]).
ManyToOne([[source1],[source2],[source3],…],[target]).

OneToOne means that a class or attribute in the source schema corresponds to a class or attribute in the target schema. OneToMany means that a class or attribute in the source schema corresponds to more than one class or attribute in the target. ManyToOne means that more than one class or attribute in the source schema correspond to one class or attribute in the target. [source] is the path of the source class or attribute; [target] is the path of the target class or attribute. [n1,m1,n2,m2,…] are values that identify substrings: the first substring runs from n1 to m1, the second substring from n2 to m2, and so on.
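To make these signatures concrete, the first fact below is a hypothetical OneToMany correspondence (the FullDate, Year and Month attributes are invented for illustration and are not part of the running example); the second restates the ManyToOne correspondence of mapping 6 (cf. Figure 8):

// Hypothetical: characters 1-4 of FullDate map to Year, 5-6 to Month.
OneToMany([InvoiceCompanyX,FullDate],
          [[InvoiceCompanyY,Year],[InvoiceCompanyY,Month]],
          [1,4,5,6]).
// From the running example: Ev, '_', Kanyvho, '_' and '01' are
// concatenated into AccDate.
ManyToOne([[InvoiceCompanyX,Ev],'_',[InvoiceCompanyX,Kanyvho],'_','01'],
          [InvoiceCompanyY,AccDate]).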
Figure 8 shows the Flora2 specification of the correspondences/mappings between the Flora2 Abstract source and target schemas from Figures 5 and 6, respectively. The mapping information is taken from our running example in Figure 4.

OneToOne([InvoiceCompanyX],[InvoiceCompanyY]).
OneToOne([InvoiceCompanyX,Bizszam],[InvoiceCompanyY,InvoiceNumber]).
OneToOne([InvoiceCompanyX,Bizkelt],[InvoiceCompanyY,InvoiceDate]).
OneToOne([InvoiceCompanyX,City],[InvoiceCompanyY,DeliveryAddress,city]).
OneToOne([InvoiceCompanyX,Zip],[InvoiceCompanyY,DeliveryAddress,zip]).
OneToOne([InvoiceCompanyX,Street],[InvoiceCompanyY,DeliveryAddress,street]).
ManyToOne([[InvoiceCompanyX,Ev],'_',[InvoiceCompanyX,Kanyvho],'_','01'],
          [InvoiceCompanyY,AccDate]).

Figure 8. Design-time correspondences between the Flora2 schemas of Company X and Company Y
For step 2 we have devised a mechanism that takes as input the Flora2 source and target schemas and the design-time correspondences between them, and generates a Flora2 program that represents the executable mappings. This can be achieved in Flora2 in a rather intuitive and straightforward way: for each object instance in the source, generate a new object (using the newoid primitive defined in Flora2), assign values to the new object according to the design-time correspondence rules, and store the new object in a target knowledge base (using the transactional insert feature of Flora2). Figure 9 shows the generated executable mapping program for our running example; a minimal sketch of the mechanism precedes it below.
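The following sketch reduces the mechanism to a single OneToOne attribute correspondence (the module and variable names follow Figure 9; everything else is simplified for exposition and is not FloraMap's literal output):

// For every source instance ?h, create one new target object ?t
// (newoid) and persist it with the copied value (insert).
?- ?h : InvoiceCompanyX@SourceInstances,
   newoid{?t},
   insert{ ?t : InvoiceCompanyY[InvoiceNumber -> ?v]
         | ?v = ?h.Bizszam@SourceInstances }.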
?- ['InvoiceCompanyX.flr'>>SourceInstances].
?- ?h: InvoiceCompanyX@SourceInstances, newoid{?t}, newoid{?t_4},
   insert{ ?t: InvoiceCompanyY[InvoiceNumber->?t_1],
           ?t: InvoiceCompanyY[AccDate->?t_2],
           ?t: InvoiceCompanyY[InvoiceDate->?t_3],
           ?t: InvoiceCompanyY[DeliveryAddress->?t_4],
           ?t_4: CompanyYDeliveryAddress[city->?t_4_1],
           ?t_4: CompanyYDeliveryAddress[zip->?t_4_2],
           ?t_4: CompanyYDeliveryAddress[street->?t_4_4]
         |
           ?t_1=?h.Bizszam@SourceInstances,
           flora_concat_items([?h.Ev@SourceInstances,'_',
             ?h.Kanyvho@SourceInstances,'_01'],?t_2)@_plg(flrporting),
           ?t_3=?h.Bizkelt@SourceInstances,
           ?t_4_1=?h.city@SourceInstances,
           ?t_4_2=?h.zip@SourceInstances,
           ?t_4_4=?h.street@SourceInstances }.

Figure 9. Flora2 executable program (run-time mappings)

In step 3, the Flora2 system is used as the underlying reasoning engine to execute the Flora2 program on the source instances. Figure 10 shows the result of applying the executable mapping program to an instance of the Company X invoice (obj) and the resulting instance of the Company Y invoice (obj1).

Flora2 source object (Company X):
obj: InvoiceCompanyX['Bizszam'->'I_001'].
obj: InvoiceCompanyX['Ev'->'2010'].
obj: InvoiceCompanyX['Kanyvho'->'05'].
obj: InvoiceCompanyX['Bizkelt'->'2010-05-18'].
obj: InvoiceCompanyX['city'->'Oslo'].
obj: InvoiceCompanyX['zip'->'1234'].
obj: InvoiceCompanyX['street'->'First Street'].

(The executable mapping program of Figure 9 is applied to obj, yielding:)

Flora2 target object (Company Y):
obj1: InvoiceCompanyY[InvoiceNumber->'I_001'].
obj1: InvoiceCompanyY[AccDate->'2010_05_01'].
obj1: InvoiceCompanyY[InvoiceDate->'2010-05-18'].
obj1: InvoiceCompanyY[DeliveryAddress->{obj_4}].
obj_4: CompanyYDeliveryAddress[city->'Oslo'].
obj_4: CompanyYDeliveryAddress[zip->'1234'].
obj_4: CompanyYDeliveryAddress[street->'First Street'].

Figure 10. Run-time mapping of Flora2 objects
3.4 OO2XML
Flora2 to XML mapping is the last process in the FloraMap execution and is concerned with the serialization of the generated Flora2 objects into XML instances. This process takes as input the target schema (both the Flora2 Abstract and Special target schemas) and the Flora2 target objects, and generates target XML instances.

In the XSD to Flora2 lifting process, FloraMap generated two Flora2 models: the Flora2 Abstract (containing the conceptual model of the schema) and the Flora2 Special (containing XSD-specific information). These two Flora2 files are used for generating the structure of the target XML instances. Note that the Flora2 Special target schema plays a key role in the serialization of the objects, because it indicates the technical details of the XML instances that should be generated. In the Flora2 to Flora2 mapping process, FloraMap generated Flora2 objects, which are used to query the values of each class and attribute. Figure 11 depicts the Flora2 to XML process in our running example.
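As a hedged illustration of the role of the Special schema in serialization (a simplified sketch; the actual emitting machinery of FloraMap is not shown, and the query form is an assumption), the element order for a class is read from the Sequences facts of the Flora2 Special target schema, and the value for each element is then queried from the target objects:

// Sketch: obtain the element order for InvoiceCompanyY from the
// Special model, then query a target object (cf. Figure 10) for a value.
?- Sequences[InvoiceCompanyY -> ?Order],
   ?o : InvoiceCompanyY,
   ?o[InvoiceNumber -> ?v].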
Flora2 object (Company Y):
obj1: InvoiceCompanyY[InvoiceNumber->'I_001'].
obj1: InvoiceCompanyY[AccDate->'2010_05_01'].
obj1: InvoiceCompanyY[InvoiceDate->'2010-05-18'].
obj1: InvoiceCompanyY[DeliveryAddress->{obj_4}].
obj_4: CompanyYDeliveryAddress[city->'Oslo'].
obj_4: CompanyYDeliveryAddress[zip->'1234'].
obj_4: CompanyYDeliveryAddress[street->'First Street'].

Target XML (Company Y):
<?xml version="1.0"?>
<InvoiceCompanyY>
  <InvoiceNumber>I_001</InvoiceNumber>
  <AccDate>2010_05_01</AccDate>
  <InvoiceDate>2010-05-18</InvoiceDate>
  <DeliveryAddress>
    <city>Oslo</city>
    <zip>1234</zip>
    <DoorNo> </DoorNo>
    <street>First Street</street>
  </DeliveryAddress>
</InvoiceCompanyY>

Figure 11. Serialization of Flora2 objects to XML instances. [Figure: the Flora2 target objects, together with the Flora2 Abstract and Flora2 Special schemas of Company Y (Figure 6), are serialized into the target XML instance.]
4. System Architecture, Implementation, and Experimental Results
The techniques outlined in the previous section have been implemented in FloraMap, a set of modules implemented in Flora2 which can be used to parse and transform XML schemas and instances into Flora2 schemas and objects, and to execute the mediation rules specified at the Flora2 level.

At design-time, FloraMap takes as input the source and target XML schemas and generates the object-oriented models of the schemas. Then, the mappings creator specifies the correspondences/mappings between the schemas (similar to the example given in Figure 8) and generates the executable mapping program (similar to the example given in Figure 9) that will be used to execute mediation on source instances.

At run-time, FloraMap takes as input the XML source instances, the Flora2 source and target schemas, and the executable mapping rules produced at design-time. Based on these inputs, FloraMap transforms the XML source instances to Flora2 objects, executes the mappings on these source objects to generate target objects, and finally serializes the target objects into XML target instances.

Figure 12 presents a high-level overview of the FloraMap modules and the interactions between them. The following are the core modules of FloraMap:
• XSD to Flora2: transforms the input XSDs to Flora2 schema models
• XML to Flora2: transforms the input XML instances to Flora2 objects
• Flora2 to Flora2: specifies the mappings between the source and target Flora2 models (OO level)
• Flora2 to XML: serializes the Flora2 objects to XML instances

Figure 12. FloraMap: Core modules and interactions. [Figure: the source and target XSDs feed the XSD to Flora2 module, producing source and target Flora2 schemas; source XML feeds XML to Flora2, producing source Flora2 objects; Flora2 to Flora2 maps source to target Flora2 objects; Flora2 to XML emits the target XML.]

Several experiments have been performed on the current implementation to test the scalability of FloraMap. The experiments have been carried out on a commodity computer (Intel(R) Core(TM) 2 Duo CPU P8600 @ 2.4GHz, 4GB RAM, Windows Vista 32-bit OS). Two types of experiments have been performed:
1. Transformation of XSDs of various sizes and complexities to Flora2 schemas.
2. End-to-end data exchange with an increasing number of instances for the running example presented in the section above.

For the first type of experiments we have used XSDs of various sizes and complexities to test the scalability of generating Flora2 object-oriented models from XML schemas. The XSDs used ranged from simple schemas, such as those presented in this paper (in Figure 4), to very complex schemas, such as the Northern European Subset of UBL (NES).2 The times needed to generate object-oriented models from XSDs are reported in Figure 13.
Figure 13. Performance results: Generation of Flora2 models from XML schemas

The results show that mapping large and complex schemas such as NES is a time-consuming task (it took about 7 minutes); however, this is not an issue, since this generation needs to be done at design time and only once. After producing the Flora2 representations of the XSDs, they can be loaded and processed rather fast by FloraMap for run-time mediation.

For the second type of experiments, where we tested the end-to-end data exchange, we have used increasing numbers of synthetically generated instances of the source schema presented in Figure 4 to generate instances of the target schema (also presented in Figure 4). This type of experiment included the complete mapping of source instances to target instances through an intermediary schema (not presented here), meaning that we had three schemas and two sets of mappings. The time needed for a complete transformation of increasing numbers (1 to 4000) of invoice instances of the Company X XSD to instances of the Company Y XSD is reported in Figure 14.

Figure 14. Performance results: End-to-end data mediation

These results show that the larger the number of instances, the more time is needed for end-to-end processing, with the growth being somewhere between linear and exponential. Whereas in some applications this can be acceptable (e.g. processing 4000 instances in about 15 minutes, as our results showed), in other applications this might not be reasonable.

5. Related Work, Conclusions, and Outlook
The problem of mapping between data structures has been extensively studied for decades, and schema mapping is well established as a research field [6,2]. Nevertheless, the use of rule-based logical systems for data mapping/exchange has not yet been widely investigated in the community. With this paper we provided a solution to the end-to-end data exchange problem based on the use of F-logic/Flora2 as a logical framework, which we used for the high-level, abstract specification of schemas and of the mappings between them, as well as for the run-time execution of mappings. Our approach allows the mappings creator to focus on the semantic, object-oriented model behind the XSD schemas and to specify the mappings at a more abstract, semantic level, rather than having to deal with the technicalities of XSD schemas. The proposed approach allows both specification and execution of data mappings (i.e. design- and run-time mapping) in a single, unifying framework, providing an end-to-end solution to the problem of XML data exchange.

There are several works that can be related to our approach. For example, [4] presents algorithms to represent XML and XSD in a mainstream object-oriented programming language. It develops two mappings: one uses a set of rules that map an XSD schema into its object-oriented schema, and the other one maps XML instances that conform to an XSD schema to their representation as objects. This is directly related to our generation of Flora2 object-oriented models from XML schemas and instances; however, the representation in [4] does not seem to be complete (e.g. it is unclear how XSD import/include statements are handled). Furthermore, our approach targets the specification of mediation as well as its run-time execution, whereas [4] focuses just on an object-oriented representation of XML schemas. Another relevant work is [5], which focuses on the generation of XML from object-oriented models. This can be related to our serialization of Flora2 objects into XML, but, as in the case of [4], the scope of our work is much broader.

In a wider context, the work presented in this paper is related to MDE model transformation techniques and languages [7,8], such as the ATLAS Transformation Language (ATL).3 Whereas model transformation languages can be applied to the XML data exchange problem addressed in this paper, it is unclear how suitable and easy to apply such general-purpose languages are for the specific case of XSD/XML. A thorough analysis of the model transformation techniques developed in the MDE community is needed in order to judge their suitability for XML data exchange. Furthermore, a systematic comparison of model transformation techniques and logical rule-based approaches for data exchange is needed in order to understand their similarities and differences, and to have a clear understanding of their advantages and disadvantages for data exchange.
The FloraMap mapping technique proposed in this paper is promising, and its implementation and experiments showed that run-time mediation is possible and feasible with a logic-based rule approach. However, there are still several directions that can be considered to further enhance FloraMap:

2 http://www.nesubl.eu/
3 http://www.eclipse.org/atl/
1. Extensions for handling end-to-end n-m mappings, where multiple sources and multiple targets can exchange data.
2. Inconsistent mappings may lead to errors during the run-time data exchange; therefore, the design and implementation of a consistency-checking technique at design time would significantly improve the mapping process. It is expected that the underlying reasoning mechanism provided by F-logic will significantly contribute to the automated detection of inconsistencies between mapping rules, making logical rule-based approaches even more attractive for data exchange.
3. Design and implementation of a graphical interface for design-time mapping. In its current implementation, FloraMap does not come with a graphical editor for Flora2 models and mappings. Reuse of open-source tools, such as those emerging in the context of the OpenII project,4 could be relevant in this context.
4. FloraMap has been designed for XML data mapping; however, since the approach works at an expressive model level, it should be fairly simple to extend it to handle other types of schemas, such as relational schemas. This would enable the exchange of data that conform to different schematic representations, e.g. relational, XML schemas, etc.
5. (Semi-)automated generation of executable mapping rules. Approaches for the automated generation of rules in the areas of ontology and MDE model transformation techniques, such as [9,10], as well as ideas from semantic Web service matchmaking, such as [11], can be employed here to provide sophisticated support for a (semi-)automated generation of mapping rules.
6. More comprehensive validation. Whereas we provided some initial experimental results for the scalability of FloraMap, other aspects of our approach need to be analyzed in a more systematic way. For example, analyzing the complexity of the specification of mapping rules, compared, for example, with the complexity of specifying mapping rules using model transformation techniques, would be another potential direction for future work.

6. REFERENCES
[1] Christoph Bussler. B2B Integration. Springer, 2003. ISBN 3540434879.
[2] Ken Smith, Peter Mork, Len Seligman, et al. The Role of Schema Matching in Large Enterprises. CIDR Perspectives, 2009.
[3] Guizhen Yang, Michael Kifer. FLORA-2: User's Manual, 2008.
[4] Suad Alagic, Philip A. Bernstein. Mapping XSD to OO Schemas. Microsoft Research, 2008.
[5] R. Xiao, Tharam S. Dillon, E. Chang, Ling Feng. Modeling and Transformation of Object-Oriented Conceptual Models into XML Schema. Database and Expert Systems Applications, 795-804.
[6] Bernstein, P. A. and Melnik, S. Model Management 2.0: Manipulating Richer Mappings. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data (Beijing, China, June 11-14, 2007).
[7] Mens, T. and Van Gorp, P. A Taxonomy of Model Transformation. Electronic Notes in Theoretical Computer Science, Volume 152, 27 March 2006, Pages 125-142.
[8] Czarnecki, K. and Helsen, S. Classification of Model Transformation Approaches. In Proceedings of the OOPSLA'03 Workshop on Generative Techniques in the Context of Model-Driven Architecture, Anaheim, California, USA.
[9] Stephan Roser, Bernhard Bauer. Automatic Generation and Evolution of Model Transformations Using Ontology Engineering Space. J. Data Semantics 11: 32-64 (2008).
[10] Gerti Kappel, Elisabeth Kapsammer, Horst Kargl, Gerhard Kramler, Thomas Reiter, Werner Retschitzegger, Wieland Schwinger, Manuel Wimmer. Lifting Metamodels to Ontologies: A Step to the Semantic Integration of Modeling Languages. MoDELS 2006: 528-542.
[11] Klusch, M. and Kaufer, F. WSMO-MX: A Hybrid Semantic Web Service Matchmaker. Web Intelligence and Agent Systems 7, 1 (Jan. 2009), 23-42.

ACKNOWLEDGMENTS
This work is partly funded by the EU projects "A Semantic Service-oriented Private Adaptation Layer Enabling the Next Generation, Interoperable and Easy-to-Integrate Software Products of European Software SMEs (EMPOWER)"5 and "Environmental Services Infrastructure with Ontologies (ENVISION)"6.

4 http://www.openintegration.org/
5 http://empower-project.eu/
6 http://www.envision-project.eu/
Behavioural Interoperability to Support Model-Driven Systems Integration

Alek Radjenovic
The University of York
Department of Computer Science
York YO10 5DD, United Kingdom
+44 1904 567836
alek@cs.york.ac.uk

Richard F. Paige
The University of York
Department of Computer Science
York YO10 5DD, United Kingdom
+44 1904 343242
paige@cs.york.ac.uk
ABSTRACT
Software system integration is a process in which the target system is synthesised from discrete components (subsystems) whilst ensuring they function together as a system and are able to deliver the required functionality. System integration is particularly important in projects in which new technologies must integrate with legacy systems. In such scenarios, this process can be broadly divided into two stages: interoperability checking and composition. Model-based approaches are promising since they allow us to carry out some of this process earlier (thus identifying problems earlier in the development lifecycle, when they are easier to rectify). In this paper we describe a generic model-based platform for system integration, applicable to different modelling languages, that supports both interoperability checking (at different levels of abstraction) and composition; our presentation focuses on the platform's support for interoperability checking. The approach, which consists of a language and a simulation tool, is presented, and its use is illustrated in a simple example of interoperability checking involving architectural models enriched with behaviour.

Categories and Subject Descriptors
I.6.4 [Simulation and Modeling]: Model Validation and Analysis

General Terms
Design, Verification.

Keywords
Model analysis, model integration, model consistency, behaviour modelling, simulation.

1. INTRODUCTION
Software system integration is a process in which the target system is synthesised from discrete components (subsystems) whilst ensuring they function together as a system and are able to deliver their intended functionality. These software elements are typically developed separately. Indeed, many software-intensive and software-dependent projects, whilst taking advantage of next-generation technologies as well as 'ready-made' third-party components, are required to reuse existing legacy software. In such scenarios, integration introduces risk because the interoperability between the various parts cannot be ascertained before late stages in the development process (i.e. during the system integration phase).

Modern software projects often use model-based development (employing various modelling platforms and notations) where models are created prior to the development of executable code. Even when models are not available (e.g., in legacy systems), system architects can use tools to generate component and architectural models automatically from source code. Increasingly, component models are described using heterogeneous modelling languages and tools. Thus, there is a substantive technical problem to be addressed in model integration. We argue that the identification of model integration mechanisms at the software architecture level is highly desirable. In particular, interoperability checking at the model level is key to identifying system integration problems early on.

Interoperability checking represents the necessary first step in model integration. The incompatibilities may arise in two different planes: structural (mainly observed at the syntax level) and behavioural (mainly observed at the semantics level). Our framework tries to address both of these issues. Many current approaches focus on one or the other, or are not sufficiently generic to support all modelling languages as they focus on specific standards, such as those of the OMG.

Our solution, which we call SMILE, is a framework within which we can (amongst other things) attach semantics, relevant to behaviour, to various structural model elements and perform execution of the specified behavioural model through simulation. Consequently, we are able to identify undesired behaviours of the combined models either through post-simulation analysis of the simulation trace or, actively, by formulating undesired conditions which cause the simulation to halt if they have been detected.
SMILE is a platform capable of manipulating models specified in different modelling languages and checking different behavioural paradigms. We achieve this by means of transformation and simulation. First, the relevant behavioural information from the input models is extracted to create a SMILE behavioural model comprising behavioural types. Second, these types are instantiated as simulation objects used in the simulation. Thus, SMILE is essentially an interchange platform for exploring behaviours in combined models. Although SMILE is a generic platform, applicable to arbitrary modelling languages, in this paper we illustrate the principles behind it using UML and its State Machine diagrams, and use this exemplar to show how it can be used in interoperability checking.
The rest of the text is organised as follows. Section 2 describes related work. Section 3 provides an overview of our approach. Section 4 first introduces the case study and then presents the compatibility-checking results. In closing, Section 5 draws conclusions and suggests future directions.
2. RELATED WORK
2.1 Model Compatibility
Various organizations and companies (OMG, IBM, Microsoft, etc.) have proposed environments to support Model Driven Engineering (MDE). Among these, the OMG MDA (Model Driven Architecture) [22] is most prominent; it focuses on the identification of basic MDE principles, its practical characteristics (direct representation, automation, and open standards), original scenarios, and discussions of suitable tools and methods. System functionality is defined as a platform-independent model (PIM) through an appropriate domain-specific language (DSL). Given a platform definition model (PDM) corresponding to a particular software technology (such as CORBA [25] or .NET [20]), the PIM is then translated into one or more platform-specific models (PSMs) for the actual implementation. One of the key obstacles to model-based interoperability, and hence to system integration, is the incompatibility of models, evident mainly at the syntactical level. In order to resolve this issue it was suggested that a unifying meta-model, to which all modelling languages concerned would conform, would be required. OMG's UML profile for Enterprise Application Integration (EAI) is defined as a complete MOF-based [27] metamodel that provides facilities for modelling the integration architecture, focusing on connectivity, composition and behaviour. The EAI UML profile also defines a MOF-based standardised data format intended for use by different systems to exchange data during integration. Data exchange is achieved by defining an EAI application metamodel that handles interfaces, and metamodels for programming languages (such as C, C++, PL/I, and COBOL) to aid the automation of transformation. While standardising on MOF is a step in the right direction, in practice there are various problems, such as the lack of widespread support for MOF by various tools and the differences between versions of XML Metadata Interchange (XMI) [26] support in tools [3]. MOF is currently the only standard that attempts to cut across the different modelling and implementation platforms.

2.2 Model Transformation
In an ideal situation, during model transformation the syntax is changed to the target modelling language whilst the semantics is preserved [12]. The overall majority of model transformation techniques, however, are defined at the metamodel level (various taxonomies can be found in [7]).

In terms of breadth of usage, three of the more successful model transformation approaches are ATL, ETL and VIATRA. ATL (ATLAS Transformation Language) [13] is a model transformation language and toolkit which provides a means to produce a set of target models from a set of source models. Developed on top of the Eclipse platform, the ATL Integrated Development Environment (IDE) provides a number of standard development tools (syntax highlighting, a debugger, etc.) that aim to ease the development of ATL transformations. ATL also includes a library of ATL transformations and has been defined to perform general transformations within the MDA framework. There are currently over 100 defined transformations in the online library. The language itself appears to be somewhat cumbersome, which is reflected in the supplied transformation examples. This may partly be due to its substantially declarative nature, because some transformations are not necessarily best expressed in this fashion (e.g., transformations that involve iterations over complex structures). However, ATL's tool support is some of the most robust in the MDE community.

ETL (Epsilon Transformation Language) [14] provides model-to-model transformation capabilities to Epsilon and can be used to transform an arbitrary number of input models into an arbitrary number of output models specified in different modelling languages. ETL, like ATL [13] and QVT [24], has a mixture of both declarative and imperative language characteristics. Declarative transformation languages are generally limited to scenarios where the source and target meta-models are similar to each other in terms of structure, and thus the transformation is a matter of a simple mapping. Imperative languages, in addition, include operations but operate at a low abstraction level. Consequently, users have to manually address issues such as tracing and resolving target elements from their source counterparts and orchestrating the transformation execution. Hybrid languages provide both a declarative rule-based execution scheme as well as imperative features for handling complex transformation scenarios. ETL is firmly in the hybrid camp, and thus targets both mapping transformations (where the source/target metamodels are similar) as well as more complex transformation scenarios. Like ATL and QVT, ETL reuses a portion of OCL for navigating model elements. Unlike ATL and QVT, ETL includes imperative constructs (such as loops, assignment statements, and sequencing of statements) that make iterative transformations much easier to express.

The VIATRA (VIsual Automated model TRAnsformations) [9] framework is the core of a transformation-based verification and validation environment for improving the quality of systems designed using UML by automatically checking consistency, completeness, and dependability requirements. Its main objective is to provide general-purpose support for the entire life-cycle of engineering model transformations, including the specification, design, execution, validation and maintenance of transformations within and between various modelling languages and domains. VIATRA intends to complement existing model transformation frameworks by providing: a model space for uniform representation of models and meta-models; a transformation language (with both declarative and imperative features, based on the popular formal mathematical techniques of graph transformation (GT) and abstract state machines (ASM)); and a high-performance transformation engine (which supports incremental model transformations, but also trigger-driven live transformations where complex model changes may trigger the execution of transformations), with its main target application domains in both model-based tool integration frameworks and model analysis transformations. More importantly, VIATRA considers scalability an important factor and claims to be able to handle well over 100,000 model elements.
openArchitectureWare (oAW) [1] is a modular MDA generator framework implemented in Java, based on the Eclipse platform. oAW can parse arbitrary models, and it has a family of languages to check and transform models as well as generate code from them. oAW has strong support for EMF (Eclipse Modelling Framework) based models but can work with other models too (e.g. UML2, XML or simple JavaBeans). oAW is based around a workflow engine which allows the definition of generator mechanisms. Various pre-built workflow components can be used to read and instantiate models, check for constraint violations, perform transformations into other models, or generate code. The openArchitectureWare team has also submitted to Eclipse a project proposal called the Textual Modeling Framework (TMF). TMF focuses on textual DSLs and Eclipse IDE integration. One of two initial contributions will be Xtext, a framework and generator providing a specialised Eclipse editor and an EMF meta-model from a simple EBNF-style grammar. Its focus will be on very short turnarounds, and it is hoped to provide powerful abstractions for the development of textual DSLs.

2.3 Model Composition
The process of model composition consists of four distinct phases: comparison, conformance checking, merging, and reconciliation (or restructuring) [5,28]. In the comparison phase, the correspondences between equivalent elements of the two models are identified, ensuring that the elements in question are not duplicated in the merged model. In the conformance checking phase, matched elements from the previous phase are examined for conformance with each other in order to identify potential conflicts that would render merging impossible. The majority of proposed approaches (e.g., [18]) address conformance checking of models through their compliance with the same meta-model. In the merging phase, a number of approaches have been proposed, such as graph-based algorithms [19,28] or an interactive process for merging UML 2.0 models [18]. The limitations of these approaches are related to the fact that they either only address merging models of the same (specific) meta-model, or use an inflexible merging algorithm with no means of extension or customisation. In the reconciliation and restructuring phase, the inconsistencies in the target model are fixed.

Next, some of the key approaches to model composition are described.
The AMW (ATLAS Model Weaver) [2] is a tool for establishing relationships between models. These links are stored in a weaving model which conforms to a weaving meta-model. Weaving models may be used in several application scenarios, such as meta-model comparison, traceability, model matching, model annotation, and tool interoperability. AMW provides a base weaving meta-model (enabling the creation of links between model elements and associations between those links) which may be extended to add further mapping semantics, providing a mechanism for creating variable mapping languages dedicated to specific application requirements.

The EML (Epsilon Merging Language) [14] adds model merging capabilities to the Epsilon platform. More specifically, EML can be used to merge an arbitrary number of input models of potentially diverse metamodels and modelling technologies. The key motivation for EML was to have a mechanism that would enable automatic model merging on a set of established correspondences. This has a number of applications in MDE. For example, EML can be used to unify two complementary, but potentially overlapping, models that describe different views of the same system. EML can also be used to merge a core model with an aspect model (potentially conforming to different meta-models). This is discussed in [21], where a core Platform Independent Model (PIM) is merged with a Platform Definition Model (PDM) (which contributes platform-specific aspects) to form a Platform Specific Model (PSM). This has been particularly useful for, e.g., performance analysis, where different system configurations (corresponding to platform-specific performance data) have been merged with system models, and the result has been used for simulation. When combined with other features of the Epsilon platform, this merging capability can be carried out iteratively, thus allowing batch generation of arbitrarily large numbers of simulation models and simulation results. EML has also been used successfully for managing versions of models.

2.4 Multi-paradigm Modelling
Multi-paradigm modelling (MPM) combines three orthogonal research fields: multi-formalism modelling (using different languages while modelling a system), model abstraction (the relationship between models at different levels of abstraction), and metamodelling (the construction of the collection of concepts that highlight the properties of the modelling language) [29]. The advocates of MPM recognise that the design of systems increasingly requires representations in various languages (formalisms) and with different abstractions, where these representations must be "coupled, combined, integrated, and transformed" [33].

In [11], the authors explore various multi-paradigm modelling techniques and evaluate them based on two criteria: 1) their level of support for an open set of modelling languages, and 2) their support for formal verification of properties. With respect to the first criterion, they draw three key conclusions. Firstly, the platforms under observation (GME [8], the Eclipse Modeling Project [10], and AToM3 [17]) allow the automatic generation of tool support for user-defined modelling languages, but their limitation is the dependency on the underlying metamodels. Secondly, the composition of modelling languages is highly dependent on the syntax and semantics being expressed in a given format. And thirdly, the task of adding support for an additional modelling language can be very difficult (the order of magnitude corresponds to describing the semantics of a modelling language). With respect to the second criterion, the authors found that reasoning about properties – at a global level and on a set of heterogeneous models – in a formal fashion represents quite a challenge.

AToM3 [16] is a tool which has received much attention in the research arena. It implements model transformation techniques based on graph rewriting. Here, input models are represented (internally) using graphs, while the transformations are specified by graph grammars which spell out the rewriting rules. The authors claim that AToM3 can potentially support a wide range of modelling languages, provided that their abstract syntax is described by a metamodel and that a transformation can be written between the source and target metamodels. This may be particularly difficult for certain types of languages [11]. The key limitation is that the number of transformations that one needs to design increases exponentially with the number of participating languages.
Other approaches either lack mature tool support (e.g. Rosetta [15]) or support only a limited range of semantics for describing behaviours (e.g. BIP [4] uses labelled transition systems).
2.5 Model Interoperability
The general notion of interoperability between systems is
defined in [30] as the ability of one system to communicate with
and access the functionality of the other system. The concept of
interoperability can also be characterised as a certain degree of
compatibility [6] where the levels of compatibility include
coexistence, interconnection, interworking, interoperation and
interchangeability, while the relevant system features that define
the compatibility level comprise communication protocols and
interfaces, data access and types, parameter semantics,
application functionality and dynamic behaviour.
SMILE-X is a natural extension of SMILE-S, and is the
component described in this paper.
DSL (structural models)
Furthermore, two or more models are interoperable if they
are related to one another in one of the following ways:
•
•
•
UML
SM1
DSL (behavioural models)
T1
e.g. state machines
Integrated – diverse models are interpreted in the
standard format which must be as rich as any of the
constituent system models
Unified –there is a common meta-level structure across
constituent models which provides a way to establish
semantic equivalence
Simulink
SM2
T2
AADL
SM3
T3
SIMULATION
(*) T1, T2, T3 ß transformations
Federated – models have to be dynamically
accommodated rather than have a predetermined
metamodel (this assumes that concept mapping is done
at the semantic level)
templates applied to structural models
Figure 1.SMILE: conceptual approach
In SMILE-X, we ignore the structural incompatibilities of
input models, such as the name and type mismatches
(identification of which is dealt within SMILE-S as part of the
structural compatibility checking) and focus solely on the
behavioural properties. A particular behavioural model (such as
the state machines) is provided as a template that enables model
transformations (i.e. mappings from specific SMILE-S elements
to a SMILE-X behavioural model – essentially a set of behaviour
types). As shown in Figure 2, SMILE-X descriptions are yet
another DSL that enable uniform representation of behavioural
models for the SMILE-X simulation engine. In SMILE-X, we
Such view clarifies the difference between full integration
and interoperability: integrated systems are interoperable while
the interoperable ones are not necessarily integrated.
2.6 Summary
A large majority of the existing approaches to model-based
system integration lack in one or more of the ‘ingredients’ we
discussed in this section. Even those which support the more or
less full set of model management techniques are typically tightly
integrated with either the Eclipse Modeling Framework (EMF)
101
3. THE SMILE-X PLATFORM
Our platform for model integration is called SMILE (Simple Model Integration Language with Execution engine). It comprises techniques, languages and tools. SMILE effectively has three main components, two of which deal with different aspects of model compatibility – SMILE-S (for structural checking) and SMILE-X (for behavioural checking) – whilst the third, SMILE-I, deals with integration.

We have initially explored the compatibility of models at the structural level. SMILE-S defines an interchange format for describing the structure of heterogeneous models in terms of trees [31], where the tree vertices (nodes) represent structural elements and the edges express the containment relationship between the elements. In addition, the nodes typically contain a collection of properties to further describe characteristics of the structural elements. The concrete syntax is effectively a DSL (domain-specific language) that provides a way to represent heterogeneous input models (e.g. UML, Simulink, and AADL, as shown in Figure 1) in a uniform fashion. The transformation of input models into SMILE trees is external to the core tool, i.e. the knowledge of the underlying meta-models and parsing is delegated to plug-in components.

Figure 1. SMILE: conceptual approach. [Figure: input models (UML, Simulink, AADL) are expressed in the structural DSL as SMILE models SM1-SM3; behavioural-model templates (e.g. state machines) are applied to the structural models via transformations T1-T3, feeding the simulation.]

SMILE-X is a natural extension of SMILE-S, and is the component described in this paper.

In SMILE-X, we ignore the structural incompatibilities of input models, such as name and type mismatches (identification of which is dealt with in SMILE-S as part of the structural compatibility checking), and focus solely on the behavioural properties. A particular behavioural model (such as state machines) is provided as a template that enables model transformations (i.e. mappings from specific SMILE-S elements to a SMILE-X behavioural model, essentially a set of behaviour types). As shown in Figure 1, SMILE-X descriptions are yet another DSL that enables a uniform representation of behavioural models for the SMILE-X simulation engine. In SMILE-X, we neither attempt to resolve any inconsistencies or incompatibilities (i.e. we merely identify them and report back to the SMILE-X user) nor do we deal with behaviour preservation. These are dealt with inside the integration (SMILE-I) component of the platform.

In this paper, we look at the scenario of homogeneous, but vendor-specific, models. In particular, we use models specified in different versions of UML using different UML tools. This scenario is commonly present in projects of large organisations where the various software components are typically a mixture of legacy code, new code, and third-party (supply chain), off-the-shelf components.

SMILE-X provides a framework where elements of input models can be mapped to (or matched against) a specified behavioural model. This behavioural model is provided in the form of a template (meta-model) which enables us to attach semantics to structural model elements and which describes a particular behavioural paradigm (or a related family of behaviours) that we are interested in analysing. The chosen template transforms the input models into an integrated SMILE-X model which describes the system's behaviour in the form of a collection of elements and maps that convey information about interactions within the system. SMILE-X transparently glues elements together either fully automatically or with additional information entered interactively by the user. Thus, SMILE-X facilitates a mechanism through which we can integrate behaviours of input models based on the chosen perspective, and consequently perform simulations on the integrated system.

The SMILE-X architecture is illustrated in Figure 2. Two or more input models are converted to SMILE trees (this functionality is part of the SMILE-S component). A behavioural template is then used to extract the relevant information from the trees and create a behavioural model (e.g. state machines) which essentially consists of a set of types that describe particular behaviours of model components. Next, a configuration is applied by means of manual intervention and with the help of other artefacts (such as class and object diagrams) in order to instantiate the behavioural types into simulation objects and to create a simulation model. Finally, we define or select a specific schedule before we can perform simulation. Each simulation run provides a trace as an output. We can then analyse the trace in order to identify undesired behaviours in the system. Alternatively, by formulating undesired conditions through the definition of triggers, we can cause the simulation to halt as soon as these conditions are detected.

Figure 2. SMILE-X architecture. [Figure: SMILE trees are mapped through a behaviour template to a behavioural model, which a configuration instantiates into a simulation model; together with a schedule and triggers, simulation produces a trace.]

Fundamentally, SMILE-X is designed to be a model interchange format. The interchange is one way – from the native (source) models (e.g. UML) to SMILE-X models – because we focus only on detecting incompatibilities and not on the integration mechanisms. A fully bidirectional platform capable of transforming the analysed models back to the original format is being implemented in the SMILE-I component.

The SMILE-X tool depends on a specific SMILE-X language whose concrete syntax is in XML and conforms to a well-defined XML Schema [32]. The tool is essentially an execution engine for SMILE-X models, and it also provides a capability to add transformation plug-ins for interchange with other modelling languages.

The template used in this paper is that of state machines, which takes an approach based on a modified discrete event system specification (DEVS) [23]. The fact that there is a significant overlap between behavioural models (such as sequence, communication and state machine diagrams) on one hand, and structural models (e.g. class, object, or component diagrams) on the other, enables us to make an uncomplicated extension to the work on structural compatibility in models. The structural elements are further enhanced with concepts that add semantics, such as: event, time, action and state. Consequently, state machine models in SMILE-X are described in terms of a well-known state transition system with actions and guards. The behaviours specified must be regarded as specifications of the actual behaviours, which can be both deterministic and non-deterministic. These behaviours are characterised by state variables whose evolution is specified by transitions. The transitions are triggered by events, guarded by conditions, and enriched by actions. However, we reiterate that SMILE-X is not restricted to state machine models; these are used here only as one example.

There are two key parts to the SMILE-X language specification. The first part (the behavioural templates) enables generic descriptions of the input models' behaviours. The second helps in defining the rules (as triggers) which aid in the detection of behavioural incompatibilities. SMILE-X builds on SMILE-S by reusing all structural component declarations and adding semantics in terms of a 'behavioural layer' to the specification. The behavioural templates also enable the user to specify sequential execution behaviour (so UML sequence, communication and state machine diagrams can all be input to the SMILE-X tool).
SMILE-X not only allows for behaviour descriptions to be
attached to instances of types (objects) but also to types, in which
case all instances of any such types will behave the same and
according to the provided specification. The users of SMILE-X
can also specify whether a particular behaviour description
specified on a type is also to be applied to any types that are
descended from that type (the so called 'loose mode') or just that
particular type ('strict mode'). In the case of loose mode, if a
descendant type has its own specific behaviour description
attached to it, then that description overrides the inherited
behaviour from the super-type.
The state machine template adopted is well known, describing a set of modelling artefacts sufficient to capture the behaviour of a model representing a reactive software system. It is depicted in the form of a UML class diagram in Figure 3, showing the key classes and the relationships between them. The Element class is the SMILE-S structural meta-model class with which the described behaviour is associated.
Figure 3. The state machine behavioural template (UML class diagram)
SMILE-X also supports concurrency - i.e. multiple threads of execution or multiple devices.
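To make the template's ingredients concrete - a sketch with assumed names, not the actual SMILE-X metamodel - transitions triggered by events, guarded by conditions and enriched by actions, attached to a structural element, can be modelled as follows:

from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Transition:
    event: str                                        # trigger
    source: str
    target: str
    guard: Callable[[Dict], bool] = lambda v: True    # guarding condition
    actions: List[Callable[[Dict], None]] = field(default_factory=list)

@dataclass
class StateMachine:
    element: str             # the SMILE-S structural element being described
    initial: str
    transitions: List[Transition]
    vars: Dict[str, object] = field(default_factory=dict)  # state variables

    def step(self, state: str, event: str) -> str:
        # Fire the first enabled transition for 'event'; otherwise stay put.
        for t in self.transitions:
            if t.source == state and t.event == event and t.guard(self.vars):
                for action in t.actions:
                    action(self.vars)
                return t.target
        return state

door = StateMachine(element="Door", initial="Closed",
                    transitions=[Transition("DOOR_OPEN", "Closed", "Open"),
                                 Transition("AUTO_CLOSE", "Open", "Closed")])
assert door.step("Closed", "DOOR_OPEN") == "Open"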
4. EXAMPLE
4.1 Case Study
This case study describes a real-time software application which is installed to control M lifts in a building with N floors. The problem concerns the logic required to move lifts between floors according to the following constraints:
• Each lift has a set of N buttons, one for each floor. These illuminate when pressed and cause the lift to visit the corresponding floor. The illumination is cancelled when the lift visits that floor.
• Each floor has one (top and bottom floors) or two (all other floors - to indicate the intended direction of travel: up or down) floor buttons to request a lift to come to the floor. The button illuminates when pressed. The illumination is cancelled when a lift visits the floor.
• Upon the arrival of a lift at any floor, the door opens and remains open for a fixed period of time after the infrared beam has last been cut by people or objects moving in and out of the lift. After the expiry of that fixed period of time, the door closes automatically. In this case study the infrared beam component is not modelled; instead, it is assumed that the door will close a fixed period of time after it has been opened.
• When a lift has no requests, it remains at its current floor with its door closed.
4.2 Use cases
There are two main use cases. The 'Calling a lift' use case describes the following scenario: (1) Passenger presses floor button; (2) Lift system detects floor button pressed; (3) Lift moves to the floor; (4) Lift doors open. The 'Travelling in a lift' use case consists of the following sequence of events: (1) Passenger gets in and presses a lift button; (2) Lift system detects lift button pressed; (3) Lift closes the doors if they are open; (4) Lift travels to the required floor; (5) Lift doors open; (6) Passenger gets out; (7) Lift doors close.
4.3 Class diagram
The system class diagram is presented in Figure 4. The Controller class represents the lift system, and there is a single instance of this class in the target software system. This class directly controls one or more Lift objects and two or more Floor objects. Each Floor object has one or two FloorButton objects, as explained in the constraints above, and one or more Door objects (depending on the number of lifts in the building). Each Lift object has two or more LiftButton objects. The FloorButton and LiftButton classes share common features embodied in their Button superclass.
Figure 4. Lift system (UML class diagram)
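The described structure and multiplicities can be sketched as follows (illustrative Python only; the actual system is modelled in UML):

from dataclasses import dataclass
from typing import List

@dataclass
class Button:
    illuminated: bool = False

class FloorButton(Button): pass
class LiftButton(Button): pass

@dataclass
class Door:
    closed: bool = True

@dataclass
class Floor:
    number: int
    buttons: List[FloorButton]   # one (top/bottom floors) or two (others)
    doors: List[Door]            # one per lift serving the floor

@dataclass
class Lift:
    buttons: List[LiftButton]    # one per floor

@dataclass
class Controller:                # single instance in the target system
    lifts: List[Lift]            # one or more
    floors: List[Floor]          # two or more

M, N = 2, 5                      # M lifts, N floors
controller = Controller(
    lifts=[Lift([LiftButton() for _ in range(N)]) for _ in range(M)],
    floors=[Floor(f,
                  [FloorButton() for _ in range(1 if f in (0, N - 1) else 2)],
                  [Door() for _ in range(M)])
            for f in range(N)])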
4.4 Sequence and state machine diagrams
The system has two sequence diagrams which correspond to the two use cases explained above. Each class also has a separate state machine diagram (apart from FloorButton and LiftButton, which inherit their behaviour from the Button class). Due to space limitations these diagrams are not all presented here, but some are illustrated in the subsections on compatibility checking below.
4.5 The development process
It is assumed that the behaviour of each of the five main classes - Lift, Button, Floor, Door, and Controller - is specified by a different team, using different UML tools which support different UML versions. This is an attempt to replicate a real-world software lifecycle where development is distributed and the tools and platforms are potentially heterogeneous.
4.6 Compatibility checking
As explained earlier, compatibility checking of behaviours in models in SMILE-X is performed through simulation, which raises three key concerns. First, we have to ensure that we have selected the relevant key characteristics and behaviours, and that the source information acquired is valid. The second concern is the use of simplifying approximations and assumptions, which are typical of the process of abstraction. Finally, we must have a high level of confidence in the simulation outcomes in terms of their trustworthiness and validity.
Our approach is that we deliberately assume that the first two conditions stated above are satisfied. By focusing solely on the third, and by identifying incompatibilities, we are able to demonstrate where the system characteristics used in the behaviour descriptions, as well as the approximations used in modelling, are inaccurate and need readjustment.
The criterion we used in detecting behavioural incompatibilities was incorrect and/or unpredictable behaviour. From a state machine perspective, this means checking that:
• all states can be (have been) reached during the simulation run
• all state combinations are valid (i.e. invalid state combinations do not occur)
• all events have been used ('fired') at least once
• all actions are performed successfully
• all guard conditions are satisfied at least once
• there are no subsystems that are disconnected from the rest of the system
• there are no deadlocks (multiple objects waiting for a resource simultaneously and thus preventing a state change)
• relevant properties hold true
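As a minimal sketch of how several of these checks can be computed from a simulation trace - the trace format and names here are assumptions, not the SMILE-X tool's internals:

trace = [                                 # (element, state entered, event fired)
    ("Floor1", "Vacant", "BUTTON_PRESS"),
    ("Floor1", "Waiting", "SERVICE_REQUEST"),
    ("Controller", "Active", None),
]
declared_states = {"Floor1": {"Vacant", "Waiting"},
                   "Controller": {"Active", "Inactive"}}
declared_events = {"BUTTON_PRESS", "SERVICE_REQUEST", "LIFT_REQUEST"}

reached, fired = {}, set()
for element, state, event in trace:
    reached.setdefault(element, set()).add(state)
    if event:
        fired.add(event)

unreached = {e: declared_states[e] - reached.get(e, set())
             for e in declared_states}
print(unreached)                  # {'Floor1': set(), 'Controller': {'Inactive'}}
print(declared_events - fired)    # {'LIFT_REQUEST'}: an unused event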
In our tool, many operations are performed automatically at first (e.g. model parsing), but manual assistance is often needed (e.g. mapping elements to state machines, or defining element dependencies with respect to model behaviour), for which an interactive process is employed. Next, we describe each type of behavioural incompatibility detected, providing an example related to the case study above.
4.7 Invalid state combinations
We define a compound state as the union of the current state of one element and the current states of all its sub-elements within the structural component hierarchy. (This is not to be confused with the UML notion of a composite state, which is different and more complex.) We also define an explicit compound state as the union of the current states of an arbitrary, user-defined set of elements.
The purpose of these definitions is to ensure that a particular set of components within the system are not concurrently in states which are mutually prohibiting. As an example extracted from the case study, we declare an explicit compound state (Figure 5) to ensure that the lift is not moving while one of its floor doors is open.
Figure 5. Explicit compound state
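A minimal sketch of this check (element and state names are illustrative): the combined current states of a user-chosen set of elements are compared against declared prohibited combinations:

current = {"Lift1": "Moving", "Door1_1": "Open", "Controller": "Active"}

# explicit compound state: a user-defined set of elements watched together
doors_left_open = ("Lift1", "Door1_1")
prohibited = {("Moving", "Open")}        # mutually prohibiting combination

combined = tuple(current[e] for e in doors_left_open)
if combined in prohibited:
    print("invalid state combination:", dict(zip(doors_left_open, combined)))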
Door is an uncomplicated class (as shown in Figure 4). Its public property Closed indicates whether the door is open or closed. The Controller's state machine coordinates the operation of the Lift objects and the Floor objects (which in turn control the opening and closing of the doors) by sending appropriate messages (such as MOVE and HALT to a Lift object, and LIFT_ARRIVED to a Floor object). If a MOVE message were sent to the Lift while the door was still open (i.e. before the AUTO_CLOSE event occurred), this would represent a hazardous scenario.
A simplified version of the Door state machine (without guard conditions or actions) is illustrated in Figure 6. The AUTO_CLOSE event is an internal timed event generated upon the expiry of a door timer, instructing an open door to close automatically after a fixed period of time (for example, 5 seconds after the door infrared beam was last cut, indicating that no more people or objects are moving in or out of the lift).
Figure 6. Door state machine
The definition of the compound state DoorsLeftOpen (Figure 5) enables the detection of such scenarios. The addition of a simple guard condition in the Controller state machine would, for example, rectify this design fault.
4.8 Unused events
Another type of analysis which reveals behavioural incompatibilities is the search for unused events. This is achieved by analysing the trace obtained from a simulation run, either by inspecting it manually or by applying a search filter.
For example, consider the two state machines shown in Figure 7 and Figure 8 (Floor and Controller). When a passenger wishes to call a lift to the floor they are on, they press the floor button. This is an external user event, which can be specified in the SMILE-X language and subsequently included in the simulation. The event is received by the Button object and converted into a BUTTON_PRESS event sent to the Floor object (Figure 7). The event is processed if Floor is in the Vacant state. Ignoring guard conditions and actions (not shown in the diagram), as they are irrelevant to this scenario, the Floor object eventually moves to the Waiting state and sends a SERVICE_REQUEST to the Controller object.
Figure 7. The Floor state machine
Figure 8. The Controller state machine
This request, however, never gets serviced. Quick scrutiny of the Controller state machine reveals that a LIFT_REQUEST event, rather than SERVICE_REQUEST, is expected from the Floor object. Hence, the LIFT_REQUEST event is never used and, as a consequence, the lift never arrives at the floor which issued the request. The ability to detect and report unused events is not a direct capability of the SMILE-X language but of the SMILE-X simulation tool. Nevertheless, it is the SMILE-X language metamodel which indirectly enables the tool to expose unused events.
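One way such a mismatch can be exposed - a sketch with illustrative names, not the tool's API - is to compare the events each machine declares as triggers with the events actually delivered to it during the run:

expected = {"Controller": {"POWER_UP", "LIFT_REQUEST", "QUEUE"}}   # declared triggers
delivered = {"Controller": {"POWER_UP", "SERVICE_REQUEST"}}        # seen in the run

for machine, triggers in expected.items():
    got = delivered.get(machine, set())
    print(machine, "never fired:", triggers - got)    # {'LIFT_REQUEST', 'QUEUE'}
    print(machine, "unrecognised:", got - triggers)   # {'SERVICE_REQUEST'}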
4.9 Unreachable states
The tool can also detect unreachable states, which may sometimes prove to be simply superfluous or may otherwise signify an error in design. In either case, the occurrence of such circumstances requires design modification. We have just demonstrated how an event (LIFT_REQUEST) may never be used. The Controller state machine in Figure 8 was purposely simplified. In particular, the Active state can be made more specific by introducing substates (e.g. NoRequests and RequestsPending, as in Figure 9). Assuming a successful power-up, Controller is in the Active state and the NoRequests substate. In this scenario, a LIFT_REQUEST event is processed in the context of the current substate. Assuming that any guards are satisfied and any declared actions performed, a QUEUE event is generated (internally to the Controller subsystem) and the substate changes to RequestsPending. However, because a LIFT_REQUEST event never arrives, as explained in the previous section, the RequestsPending substate can never be reached. After a simulation run under these circumstances, the SMILE-X tool can report on all unreachable states.
Figure 9. Controller state machine with substates
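A sketch of such a reachability report (the transition relation and names are illustrative): explore the declared transitions using only the events that can actually occur:

transitions = {                      # (state, event) -> next state
    ("Active.NoRequests", "LIFT_REQUEST"): "Active.RequestsPending",
    ("Active.RequestsPending", "QUEUE"): "Active.NoRequests",
}
all_states = {"Active.NoRequests", "Active.RequestsPending"}
available_events = set()             # LIFT_REQUEST never arrives (see 4.8)

reachable, frontier = set(), ["Active.NoRequests"]
while frontier:
    state = frontier.pop()
    if state in reachable:
        continue
    reachable.add(state)
    for (source, event), target in transitions.items():
        if source == state and event in available_events:
            frontier.append(target)

print(all_states - reachable)        # {'Active.RequestsPending'}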
4.10 Disconnected subsystems
In the absence of sequence (or communication) diagrams, the state machines are initially disconnected and there is no communication between the various subsystems. SMILE-X provides a means of mapping the events generated by an element to the intended target element. For example, a Floor object generates DOOR_OPEN, DOOR_CLOSE and SERVICE_REQUEST events for its environment (Figure 7). The first two are intended for the Door objects, while the third is for the Controller object. Figure 10 is a screenshot from the SMILE-X tool showing how this mapping is done in a straightforward fashion.
Figure 10. Event mapping in SMILE-X
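A sketch of such an event-routing map (illustrative only, not the tool's format):

routing = {                              # (source element, event) -> target
    ("Floor", "DOOR_OPEN"): "Door",
    ("Floor", "DOOR_CLOSE"): "Door",
    ("Floor", "SERVICE_REQUEST"): "Controller",
}

def dispatch(source, event):
    # Route an event generated by 'source' to its intended target element.
    if (source, event) not in routing:
        raise LookupError(f"{source} emits {event} but no target is mapped "
                          "- a potentially disconnected subsystem")
    return routing[(source, event)]

assert dispatch("Floor", "SERVICE_REQUEST") == "Controller"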
4.11 Out-of-sequence messages
Sequence diagrams show how objects (or processes) operate with one another and in what order. In SMILE-X, sequence diagrams are primarily used to extract the dependency information between the various objects in the system, in order to create a 'source object - event - target object' map that identifies which elements generate which events and which elements are the recipients of those events. Once extracted from the input models, the sequence (order) of messages (events) exchanged between the elements is recorded internally in the SMILE-X notation. Manual adjustment of the timing parameters (service times) of the elements' actions, as well as of the priority of events, is part of the normal analysis and refinement process that occurs between simulation runs. In particular scenarios (using a particular set of simulation parameters), the sequence of messages exchanged may become different from the intended one. SMILE-X can detect situations like this. Moreover, the SMILE-X notation allows the event sequences to be specified manually by the tool user. It is not necessary that all events are described - just the key ones. This enables designers to ensure that particular events appear in order. For example, we may want to make certain that the DOOR_OPEN event always occurs after the LIFT_ARRIVED event.
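A minimal sketch of such an ordering check over a simulated event trace (names illustrative):

def occurs_in_order(trace, before, after):
    # True if every occurrence of 'after' is preceded by some 'before' event.
    seen_before = False
    for event in trace:
        if event == before:
            seen_before = True
        elif event == after and not seen_before:
            return False
    return True

assert occurs_in_order(["MOVE", "LIFT_ARRIVED", "DOOR_OPEN"],
                       "LIFT_ARRIVED", "DOOR_OPEN")
assert not occurs_in_order(["MOVE", "DOOR_OPEN", "LIFT_ARRIVED"],
                           "LIFT_ARRIVED", "DOOR_OPEN")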
4.12 Deadlocks
Deadlock is a condition in which two or more software objects (processes) are waiting for each other to release a resource, or are waiting for resources in a circular fashion. Typically, deadlocks are a widespread problem in multiprocessing, where multiple processes share a specific type of mutually exclusive resource known as a lock or soft lock. Deadlocks can be identified in state machines by an object which cannot leave a particular state even though it is receiving events that should cause a transition. This typically happens if a guard condition does not hold true, either every time that event arrives or for a long succession of event arrivals. SMILE-X can detect and report on these situations.
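One way such situations can be flagged from a trace - a sketch whose threshold and trace format are assumptions, not the tool's behaviour:

def stuck_suspects(trace, threshold=3):
    # trace: (element, current state, event received) records in arrival order.
    # An element that repeatedly receives the same event in the same state
    # without leaving it is a deadlock suspect (e.g. a guard that never holds).
    counts, suspects = {}, set()
    for element, state, event in trace:
        key = (element, state, event)
        counts[key] = counts.get(key, 0) + 1
        if counts[key] >= threshold:
            suspects.add(key)
    return suspects

trace = [("Lift1", "Idle", "MOVE")] * 4       # MOVE arrives but Lift1 stays Idle
print(stuck_suspects(trace))                  # {('Lift1', 'Idle', 'MOVE')}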
4.13 Properties that do not hold
Often, we would like to reason about invariants in the context of state transitions, in the form of a guard condition which must hold true throughout the entire transition. Such guards are evaluated at the following points in time: (i) at the arrival of the event; (ii) after each transition action is executed; (iii) just before the change of state. SMILE-X can monitor these situations, and detect and report when such guards fail to hold.
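A minimal sketch of monitoring an invariant guard at these three evaluation points (all names illustrative):

def run_transition(state_vars, invariant, actions):
    # Evaluate the invariant at the three points named above; record failures.
    failures = []
    if not invariant(state_vars):
        failures.append("at event arrival")
    for i, action in enumerate(actions):
        action(state_vars)
        if not invariant(state_vars):
            failures.append(f"after action {i}")
    if not invariant(state_vars):
        failures.append("just before the change of state")
    return failures

state_vars = {"moving": False, "door_closed": True}
invariant = lambda v: not (v["moving"] and not v["door_closed"])
actions = [lambda v: v.update(moving=True),         # start moving
           lambda v: v.update(door_closed=False)]   # open door: violates invariant
print(run_transition(state_vars, invariant, actions))
# ['after action 1', 'just before the change of state']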
5. CONCLUSIONS AND FUTURE WORK
We believe that model integration at the architectural level is of great importance. It provides a way of detecting and resolving issues that would not otherwise become apparent until very late in the project - during the system integration phase. This approach may not only reduce the typical risks associated with integrating systems whose components are developed in a distributed fashion, but can also substantially reduce development costs.
We have identified deficiencies and fragmentation in the current approaches and have provided an initial framework that checks for incompatibilities at both the structural and the behavioural level. Here, we have focused on the latter and described how we have identified some of the more important incompatibilities in behavioural descriptions of models. We directed our attention to a specific system type - the reactive system - which represents a large group of complex, large-scale software systems today. Behaviour can be described in various ways. UML defines several different kinds of behaviour diagrams, two of which are most commonly used: sequence diagrams and state machine diagrams. In this paper, we have focused on exactly these types of behavioural descriptions.
We have extended our existing SMILE platform to include SMILE-X, the behavioural component. SMILE-X builds on the previous structural component (SMILE-S) by providing a mechanism for adding semantics to the existing structural elements. SMILE-X comprises a language and a tool. The language enables interchange between differing model descriptions of system behaviour using the standard XML format. SMILE-X models are not compiled but simulated. Our tool provides a way of loading the input models, refining and extending behavioural descriptions in the SMILE-X format, and specifying various simulation parameters; it also includes an execution engine which enables the user to run simulations.
We have identified seven different generic behavioural incompatibilities related to state machine descriptions and to the sequential communication between components of the system. We have used a case study which models system components in different, vendor-specific versions of UML. In this proof-of-concept study, we transformed the input models into the SMILE-X interchange format, and then manipulated, refined and glued these models together in order to perform meaningful simulations.
There are a number of interesting directions in which to go next. One of them would be to identify, through the observation of particular behavioural attributes such as events or states, general-purpose patterns with which to detect some of the behavioural incompatibilities automatically. Another opportunity is to see whether causal paths can be uncovered. These would ideally (in the state machine scenario) include paths that lead to the same states and events, as well as the ability to show alternative paths between states (if such paths exist).
6. ACKNOWLEDGMENTS
This work was undertaken at SSEI (Software Systems Engineering Initiative), an MOD (Ministry of Defence) funded strategic initiative intended to enhance through-life capability management for software-intensive defence systems.