Scientific Workflow Management Systems

in e-Science environments

Notes and collections

 

Zhiming Zhao Ph.D.

 

Created 5/8/2005,

Last updated 18/03/2008

 

Informatics Institute, University of Amsterdam

Kruislaan 403, 1098SJ, Amsterdam, the Netherlands

Email: zhiming@science.uva.nl

WWW: http://staff.science.uva.nl/~zhiming

 

 

Where

Task

Goal

State

Workflow bus

Virtualisation

Abstract different visions on workflow bus

-         It is a meta workflow

-         It is an integration framework

-         It is a coordination agent

-         It is a federation

-         It is a runtime infrastructure

 

Requirement:

-         Meta engine

Websphere

General study, use, etc.

Community of websphere (Websphere.Org)

About Websphere: a family of tools and components provided by IBM for web development (a quick tutorial)

-         Application server

-         Development studio

-         Others

 

BPEL and Scientific computing

 

Understand three things:

-         What needs to be standardised in scientific computing;

-         What the existing solutions in scientific computing are;

-         How industrial standards can help with these issues.

Data base

Postgresql

Data base related issues:

-         administration

-         access and operation

Java 2 WSDL,

WSDL 2 JAVA

Automatically generate Java or WSDL files.

The basic idea: from a Java interface to XML (from Java objects to a SOAP web service using Apache AXIS).

[1] AXIS is a SOAP server, which works with a servlet container such as the TOMCAT application server.

[2] When installing TOMCAT, check two things: the compatibility between TOMCAT and the JVM, and the user account in ./conf/tomcat-users.xml.

[3] Configure AXIS so that it can be accessed via TOMCAT. You should then be able to validate the installation.

[4] Add new web services using AXIS. This takes two steps: copy the classes and libraries into the AXIS war directory, and register the service with the AXIS engine.

[5] Simple JWS web service: the source file declares no package, and the code is compiled at runtime.

Ptolemy

Understand actor, director, and other concepts.

How to use Ptolemy to construct an application;

How to construct an actor or a director

 

Kepler

Wrap a tool into Kepler actor

 

Taverna

Wrap a tool into Taverna WS

 

Triana

Wrap a tool into Triana tool

 

WSRF

Related to the first one

 

FIPA

 

 

Integrating Ontology

 

 

FIPA and WSRF

 

 

JADE

Remote agent launch

 

Bash script

String

#!/bin/bash
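line='this "is" a command;this "is" a pattern'
COMMAND=${line%;*}   # drop the shortest suffix matching ;* (text before the last ;)
PATTERN=${line#*;}   # drop the shortest prefix matching *; (text after the first ;)
echo "$COMMAND"
echo "$PATTERN"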
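For comparison, a minimal Java sketch of the same two expansions. It assumes the usual bash semantics: ${line%;*} removes the shortest suffix matching ;* (cut at the last semicolon) and ${line#*;} removes the shortest prefix matching *; (cut after the first semicolon). The class and method names are made up for illustration.

```java
// Illustrative helper; the names are hypothetical and not from any library.
final class BashStyleSplit {

    // Mirrors ${line%;*}: remove the shortest suffix that starts with ';'.
    // If there is no ';', the string is returned unchanged (as in bash).
    static String beforeLastSemicolon(String line) {
        int i = line.lastIndexOf(';');
        return i < 0 ? line : line.substring(0, i);
    }

    // Mirrors ${line#*;}: remove the shortest prefix that ends with ';'.
    // If there is no ';', the string is returned unchanged (as in bash).
    static String afterFirstSemicolon(String line) {
        int i = line.indexOf(';');
        return i < 0 ? line : line.substring(i + 1);
    }

    public static void main(String[] args) {
        String line = "this \"is\" a command;this \"is\" a pattern";
        System.out.println(beforeLastSemicolon(line));
        System.out.println(afterFirstSemicolon(line));
    }
}
```

With more than one semicolon in the line the two results overlap, since one cut is made at the last semicolon and the other at the first.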

 

 

 

Resource registry

UDDI

-         It is a standard (www.uddi.org): Universal Description, Discovery and Integration.

-         It is now V3. http://uddi.org/pubs/uddi-v3.00-published-20020719.htm

-         Commercial and open source tools: http://uddi.org/solutions.html

-         Developed for WS;

-         Core data structure

-            - Business entity -> contain business service. Business entity has a unique business key;

-            - business service; -> contains binding template;

-            - binding template; -> reference to tModel;

-            - tModel (technology model);
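The containment chain of the core data structure can be sketched as plain Java classes. This is only an in-memory illustration under my own assumptions: the class names, fields, keys and the accessPointsFor helper are hypothetical and are not the API of any UDDI client library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the UDDI containment hierarchy.
// These are illustrative classes, not the API of any UDDI client library.
final class UddiModelSketch {

    // tModel: describes a technical interface or classification.
    static final class TModel {
        final String key;
        final String name;
        TModel(String key, String name) { this.key = key; this.name = name; }
    }

    // bindingTemplate: one access point plus references to tModels.
    static final class BindingTemplate {
        final String accessPoint;
        final List<TModel> tModels = new ArrayList<>();
        BindingTemplate(String accessPoint) { this.accessPoint = accessPoint; }
    }

    // businessService: contains binding templates.
    static final class BusinessService {
        final String name;
        final List<BindingTemplate> bindings = new ArrayList<>();
        BusinessService(String name) { this.name = name; }
    }

    // businessEntity: identified by a unique business key, contains services.
    static final class BusinessEntity {
        final String businessKey;
        final List<BusinessService> services = new ArrayList<>();
        BusinessEntity(String businessKey) { this.businessKey = businessKey; }
    }

    // Walks entity -> service -> binding -> tModel and returns the access
    // points of all bindings that reference the given tModel key.
    static List<String> accessPointsFor(BusinessEntity entity, String tModelKey) {
        List<String> result = new ArrayList<>();
        for (BusinessService service : entity.services) {
            for (BindingTemplate binding : service.bindings) {
                for (TModel tModel : binding.tModels) {
                    if (tModel.key.equals(tModelKey)) {
                        result.add(binding.accessPoint);
                        break; // one match per binding is enough
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // All keys and URLs below are made-up example values.
        TModel weather = new TModel("uddi:example.org:weather-interface", "Weather interface");
        BindingTemplate binding = new BindingTemplate("http://example.org/weather");
        binding.tModels.add(weather);
        BusinessService service = new BusinessService("Weather service");
        service.bindings.add(binding);
        BusinessEntity entity = new BusinessEntity("uddi:example.org:business-1");
        entity.services.add(service);

        System.out.println(accessPointsFor(entity, weather.key));
    }
}
```

Finding a service by its technical interface means following the whole chain down to the tModel reference, which is the indirection that the limitations listed next refer to.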

 

Limitation of UDDI (http://www.ggf.org/GGF16/materials/OMII-UK-GGF16/OMII-UK-GGF16-GRIMOIRES.ppt)

-         Rigid data model with limited extensibility

-            - Prevents customised service description

-         WS technical interface is expressed as an outgoing link (through tModel)

-            - Prevents discovery based on technical interface detail

-            - Burden is on clients to parse multiple external documents and reason over them

-         No support for metadata published by third parties

-         No lifetime management

 

Extension of UDDI. UDDI + service annotation service: Grimoires (http://www.ggf.org/GGF16/materials/OMII-UK-GGF16/OMII-UK-GGF16-GRIMOIRES.ppt)

1)      Install OMII.

2)      Install Grimoires and check its documentation. To run GShell, use ant run in the GShell directory and set the correct classpath. One thing to change: in build.xml, correct the location of the services: :8080/grimoires/services

Registry

DESIRE registry

http://desire.ukoln.ac.uk/registry/

Basic Semantic Registry

http://www.iso.ch

ISO 11179

http://pueblo.lbl.gov/~olken/X3L8/activities.html

BIBLINKCore

http://hosted.ukoln.ac.uk/biblink/wp8/fs/bc-semantics.html

Dublin Core

http://mirrored.ukoln.ac.uk/dc/

ROADS Templates

http://www.ukoln.ac.uk/metadata/roads/templates/

 

Industrial vendors

Acumen (http://www.acumentechnologies.com/),

Webmethods (http://www.webmethods.com/)

Examples

Kepler Library

 

 

 

 

Java notes

Override

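// Both classes in one file (testA.java): only testA may then be public.
class testB {

    public void writeA() {
        System.out.println("in testB");
    }

    // Calls writeA() on "this", so the call is dispatched on the runtime type.
    public void writeCA() {
        writeA();
    }
}

public class testA extends testB {

    @Override
    public void writeA() {
        System.out.println("in testA");
    }

    @Override
    public void writeCA() {
        super.writeCA();
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        testA tt = new testA();
        tt.writeCA();
    }
}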

 

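The output is "in testA". If writeCA() in testA is commented out, the result is still "in testA": the writeA() call inside testB.writeCA() is dispatched on the runtime type of the object, which is testA.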

 

 

Agent and Grid computing

How agents meet Grid security, VO, etc.

-         Security inheritance

-          

How agents live in Grid

-         Migration

-         Dynamic deployment

-          

How agents serve people

-         Information probing

-         Execute workflow

-         Aggregate workflows

-         Discover services

Compared to other techniques: what is the added value

Compared to OGSA, services

 

Sustainability

 

Involved technologies in Agents

 

 

Research plan

 

 

 

 

REO and constraint automata

Install plugin in Eclipse

 

 

 

 

 

JADE and workflow bus

Get Jade

From website: http://jade.tilab.com

In Eclipse

http://www.mars-team.com/. Download the jar file (plugin) and put it in the eclipse plugin sub dir.

- Check the compatibility between Eclipse and Plugin

Ontology and ACL in JADE

http://protege.cim3.net/cgi-bin/wiki.pl?OntologyBeanGenerator

Ontology bean generator in Protege

Compile examples

 

Start Agents

 

Notes of JADE

If you just need to get the list of all agents in the platform you can simply use the search() method of the jade.domain.AMSService class.

 

 

java jade.Boot [options] [AgentSpecifier list]

java jade.Boot -container [options] [AgentSpecifier list]

java -jar lib\jade.jar -nomtp [options] [AgentSpecifier list]

Obtain agent list

 

Obtain container list

 

 

Container

-         Local main container: java jade.Boot

-         Local container: java jade.Boot -container

-         Remote main container: NO

-         Remote container: java jade.Boot -host XXX -container

Agent

-         Local main container, local container, remote main container and remote container all use the same form: java jade.Boot [-host NAME] [-container] agentName:agentExecutable

 

Containers

Main-Containers

RMA GUI

Migrate sub menu or RMA

Migrate sub menu RMA

Command line

 

 

In program

 

 

Fault tolerant

 

 

 

 

 

Styx and Griddles

Griddles

 

 

Styx

 

 

 

 

SQL, mysql

Basic command

create database DATABASENAME;

use DATABASENAME;

create table TABLENAME (COLUMNNAME TYPE(SIZE) NULL | NOT NULL, ...);

insert into TABLENAME (COLUMN LIST) values (VALUE LIST);

select COLUMN from TABLENAME where CONDITION;

select count(COLUMN) from TABLENAME where CONDITION;

Install mysql

Unpack the package

~/bin/mysqld_safe (safe_mysqld in older releases)

~/support-files/mysql.server

 

to run automatically,

copy mysql.server to /etc/init.d/mysqld

chmod +x mysqld

change certain configuration in /etc/my.cnf

Dump/input

mysqldump -u XXX -pPPP databasename > backname

mysql -u XXX -pPPP databasename < backname

URL

jdbc:mysql://localhost:3306/DATABASENAME
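A minimal sketch of how this URL is typically used from Java. It assumes that a MySQL JDBC driver (MySQL Connector/J) is on the classpath and that a server listens on localhost:3306; the class name, user name and password are placeholders of mine.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Illustrative sketch; class name, user and password are placeholders.
final class MysqlJdbcUrlSketch {

    // Builds a URL of the form jdbc:mysql://HOST:PORT/DATABASENAME.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Opens a connection through the standard JDBC DriverManager.
    // This only succeeds if a MySQL driver is available at runtime.
    static Connection connect(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", 3306, "DATABASENAME");
        System.out.println(url);
        try (Connection c = connect(url, "user", "some_pass")) {
            System.out.println("connected: " + !c.isClosed());
        } catch (SQLException e) {
            // Reached when no driver is on the classpath or no server is running.
            System.out.println("could not connect: " + e.getMessage());
        }
    }
}
```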

Administration

Add user

Log as root

Add value to the user data base

mysql -u root -p

mysql> GRANT ALL PRIVILEGES ON *.* TO monty@localhost
    ->     IDENTIFIED BY 'some_pass' WITH GRANT OPTION;
mysql> GRANT ALL PRIVILEGES ON *.* TO monty@'%'
    ->     IDENTIFIED BY 'some_pass' WITH GRANT OPTION;
mysql> GRANT RELOAD,PROCESS ON *.* TO admin@localhost;
mysql> GRANT USAGE ON *.* TO dummy@localhost;
 

mysql> grant all on *.* to user@localhost;

mysql> grant all on dbname.* to user@localhost;

mysql> grant all on dbname.* to user@localhost identified by 'some password';

 

Set password

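set password for 'someone'@'host' = password('password');

This is encrypted (the value is hashed by password());

set password for 'someone'@'host' = 'password';

Not encrypted: the string is stored as given, so it must already be a hash produced by password();

set password for 'zhiming'@'rabbit' = password('1234');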

 

Change privileges

Grant all privileges on DATABASE to USERLIST

Revoke

 

Show data base, tables, and fields

Show databases;

Show tables;

Describe TABLENAME;

 

others

mysql --user=root mysql

mysql> insert into user

    ->     values('localhost', 'zhiming', password('password'), 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y');

mysql> flush privileges;

 

mysql> INSERT INTO user VALUES('localhost','monty',PASSWORD('some_pass'),
    ->          'Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y');
mysql> INSERT INTO user VALUES('%','monty',PASSWORD('some_pass'),
    ->          'Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y','Y');
mysql> INSERT INTO user SET Host='localhost',User='admin',
    ->           Reload_priv='Y', Process_priv='Y';
mysql> INSERT INTO user (Host,User,Password)
    ->           VALUES('localhost','dummy','');
mysql> FLUSH PRIVILEGES;

 

 

 

Change password

set password for 'someone'@'host' = old_password('mypassword');

Installation

Install the software

Open source Database front end

http://www.java2s.com/Product/Java/Database/ISQL-Database-FontEnd.htm

http://databasebrowser.sourceforge.net/

JAR howto

Basic

jar cf jarname.jar *.class // to create a jar.

jar xf jarname.jar // to extract a jar.

jar cmf mainclass.txt jarname.jar *.class // to make an executable jar.

A mainclass.txt looks like:

 

--

Ant-Version: Apache Ant 1.6.5

Created-By: 1.4.2_08-b03 (Sun Microsystems Inc.)

Main-Class: org.dbbrowser.DBBrowser

Class-Path: lib/DBBrowserHelp.jar lib/infrastructure.jar lib/jh.jar li

 b/jhall.jar lib/jhbasic.jar lib/jsearch.jar lib/junit.jar lib/log4j-1

 .2.8.jar lib/looks-1.3.1.jar lib/skinlf.jar lib/activation.jar lib/ma

 il.jar lib/smtp.jar lib/pop3.jar lib/mailapi.jar lib/imap.jar lib/syn

 tax.jar lib/itext-1.3.jar lib/jxl.jar lib/xstream-1.1.3.jar lib/xpp3-

 1.1.3.4d_b4_min.jar lib/axis.jar

--
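A small sketch, using the JDK's own java.util.jar.Manifest class, of how such a file is read back. It shows that a continuation line (a line starting with one space) is joined to the previous line without inserting a space, which is why the Class-Path entries above can be broken in the middle of a file name. The manifest text in the example is a shortened, made-up variant of the one above; the class name ManifestSketch is mine.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Illustrative sketch; only java.util.jar classes from the JDK are used.
final class ManifestSketch {

    // Parses manifest text; every line, including the last, must end with a newline.
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened example: "li" + "b/jhall.jar" is one path split across two lines.
        String text =
              "Manifest-Version: 1.0\n"
            + "Main-Class: org.dbbrowser.DBBrowser\n"
            + "Class-Path: lib/DBBrowserHelp.jar li\n"
            + " b/jhall.jar\n";

        Attributes main = parse(text).getMainAttributes();
        System.out.println(main.getValue(Attributes.Name.MAIN_CLASS));
        System.out.println(main.getValue(Attributes.Name.CLASS_PATH));
    }
}
```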

Install sesame

-         Download and install Tomcat;

-         Download and install a database, e.g., mysql;

-            - create the database names;

-            - create the sesame user;

-            - grant privileges to the sesame user (not only at database level but also at global level; see http://dev.mysql.com/doc/refman/5.0/en/grant.html);

-         Download sesame (www.openrdf.org). Note that sesame 1 and sesame 2 are not compatible;

-            - unpack sesame into tomcat: jar xvf sesame.war into $CATALINA/webapp/sesame;

-            - configure system.conf;

-         Restart the tomcat app server.

-          

 

-          

Handy MySQL Commands

Description

Command

To login (from unix shell) use -h only if needed.

[mysql dir]/bin/mysql -h hostname -u root -p

Create a database on the sql server.

create database [databasename];

List all databases on the sql server.

show databases;

Switch to a database.

use [db name];

To see all the tables in the db.

show tables;

To see database's field formats.

describe [table name];

To delete a db.

drop database [database name];

To delete a table.

drop table [table name];

Show all data in a table.

SELECT * FROM [table name];

Returns the columns and column information pertaining to the designated table.

show columns from [table name];

 

 

Show certain selected rows with the value "whatever".

SELECT * FROM [table name] WHERE [field name] = "whatever";

 

 

Show all records containing the name "Bob" AND the phone number '3444444'.

SELECT * FROM [table name] WHERE name = "Bob" AND phone_number = '3444444';

 

 

Show all records not containing the name "Bob" AND the phone number '3444444' order by the phone_number field.

SELECT * FROM [table name] WHERE name != "Bob" AND phone_number = '3444444' order by phone_number;

 

 

Show all records starting with the letters 'bob' AND the phone number '3444444'.

SELECT * FROM [table name] WHERE name like "Bob%" AND phone_number = '3444444';

 

 

Use a regular expression to find records. Use "REGEXP BINARY" to force case-sensitivity. This finds any record beginning with a.

SELECT * FROM [table name] WHERE rec RLIKE "^a";

 

 

Show unique records.

SELECT DISTINCT [column name] FROM [table name];

Show selected records sorted in an ascending (asc) or descending (desc).

SELECT [col1],[col2] FROM [table name] ORDER BY [col2] DESC;

Return number of rows.

SELECT COUNT(*) FROM [table name];

 

 

Sum column.

SELECT SUM([column name]) FROM [table name];

 

 

Join tables on common columns. This statement joins the birthday in the person table with the primary illustration id.

select lookup.illustrationid, lookup.personid, person.birthday from lookup left join person on lookup.personid = person.personid;

Switch to the mysql db. Create a new user.

INSERT INTO [table name] (Host,User,Password) VALUES('%','user',PASSWORD('password'));

Change a user's password (from unix shell).

[mysql dir]/bin/mysqladmin -u root -h hostname.blah.org -p password 'new-password'

Change a user's password (from MySQL prompt).

SET PASSWORD FOR 'user'@'hostname' = PASSWORD('passwordhere');

Switch to mysql db. Give user privileges for a db.

INSERT INTO [table name] (Host,Db,User,Select_priv,Insert_priv,Update_priv,Delete_priv,Create_priv,Drop_priv) VALUES ('%','db','user','Y','Y','Y','Y','Y','N');

To update info already in a table.

UPDATE [table name] SET Select_priv = 'Y',Insert_priv = 'Y',Update_priv = 'Y' where [field name] = 'user';

Delete a row(s) from a table.

DELETE from [table name] where [field name] = 'whatever';

Update database permissions/privileges.

FLUSH PRIVILEGES;

Delete a column.

alter table [table name] drop column [column name];

Add a new column to db.

alter table [table name] add column [new column name] varchar (20);

Change column name.

alter table [table name] change [old column name] [new column name] varchar (50);

Make a unique column so you get no dupes.

alter table [table name] add unique ([column name]);

Make a column bigger.

alter table [table name] modify [column name] VARCHAR(3);

Delete unique from table.

alter table [table name] drop index [column name];

Load a CSV file into a table.

LOAD DATA INFILE '/tmp/filename.csv' replace INTO TABLE [table name] FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (field1,field2,field3);

Dump all databases for backup. Backup file is sql commands to recreate all db's.

[mysql dir]/bin/mysqldump -u root -ppassword --opt --all-databases >/tmp/alldatabases.sql

Dump one database for backup.

[mysql dir]/bin/mysqldump -u username -ppassword --databases databasename >/tmp/databasename.sql

Dump a table from a database.

[mysql dir]/bin/mysqldump -c -u username -ppassword databasename tablename > /tmp/databasename.tablename.sql

Restore database (or database table) from backup.

[mysql dir]/bin/mysql -u username -ppassword databasename < /tmp/databasename.sql

Create Table Example 1.

CREATE TABLE [table name] (firstname VARCHAR(20), middleinitial VARCHAR(3), lastname VARCHAR(35),suffix VARCHAR(3),
officeid VARCHAR(10),userid VARCHAR(15),username VARCHAR(8),email VARCHAR(35),phone VARCHAR(25), groups
VARCHAR(15),datestamp DATE,timestamp time,pgpemail VARCHAR(255));

Create Table Example 2.

create table [table name] (personid int(50) not null auto_increment primary key,firstname varchar(35),middlename varchar(50),lastname varchar(50) default 'bato');

 

 

Storage Resource Broker notes

The SDSC Storage Resource Broker (SRB) is client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets. SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their attributes rather than their names or physical locations.

The SDSC SRB system is a comprehensive distributed data management solution, with features to support the management, collaborative (and controlled) sharing, publication, and preservation of distributed data collections. The SRB also serves as middleware via a rich set of APIs available to higher-level applications and by providing a management layer on top of a wide variety of storage systems.

The SRB is an integrated solution which includes a logical namespace (i.e. the mapping of each data object (file) from a logical to physical name/location), interfaces to a wide variety of storage systems, high performance data movement (including parallel I/O), fault-tolerance and fail-over, WAN-aware performance enhancements (bulk operations), storage-system-aware performance enhancements ('containers' to aggregate files), metadata ingestion and queries (a MetaData Catalog (MCAT)), user accounts, groups, access control, audit trails, administration tools (including Java GUI tool, and installation scripts), user tools (including a Windows GUI tool (inQ), a set of SRB Unix commands, and Web (mySRB)), and APIs (including C, C++, Java, and Python). In addition to supporting large-scale data access (many millions of files, terabytes of data) across Wide Area Networks (distributed SRB servers), recent versions also support multiple MCATs in independent, yet collaborating, administrative domains for additional performance, reliability, and administrative control and access. Also available is an SRB-based data grid workflow management system, SDSC Matrix, which can be used to create, access and manage workflow process pipelines.

 

Download and install it from SDSC http://www.sdsc.edu/srb/index.php/Main_Page.

An SRB installation consists of three components:

-         The Metadata Catalogue (MCAT) database (one)

-         The SRB Server (one or more)

-         The SRB Client (more)

The MCAT database is a metadata repository that provides a mechanism for storing information used by the SRB system. This includes both internal system data required for running the system and application data regarding data sets being brokered by SRB e.g. your own metadata. SRB makes a clear distinction between these two types of data. At least one SRB Server must be installed on the node that hosts the MCAT database. MCAT performs the following data management operations:

  • Stores Metadata on Data sets, Users, Resources, Proxy Methods.
  • Maintains replica information for data and containers.
  • Provides Collection abstraction for data.
  • Provides Global User name space and authentication.
  • Provides Authorization through ACL and tickets.
  • Maintains audit trail on data and collections.
  • Maintains metadata for methods and resources.
  • Provides Resource Transparency - logical resources.

 

The SRB Server is a middleware application that accepts requests from clients and obtains the necessary data sets. It queries the MCAT database to gather information on datasets and supplies this back to the SRB client. The SRB server can operate in a federated mode, whereby it can request another SRB server to obtain the necessary data on its behalf. The data is then transparently passed back to the client.

The SRB Client is an end user tool that provides a user interface to send requests to the SRB server. There are 5 main implementations of this:

 

 

BPEL and related stuff

About BPEL

The BPEL language was originally developed by IBM and Microsoft and submitted to the Organization for the Advancement of Structured Information Standards (OASIS) in March 2003. Since then, it has gained wide support among the leading IT technology vendors (including Oracle, SAP, Siebel, and Sun) because it enables developers and managers to specify how Web services work together to provide an integrated, end-to-end transactional path. Interestingly, BPEL does not specify how the various Web services perform their tasks, but rather how to "orchestrate" or coordinate their activities into a business flow.

 

BPEL provides a model for describing simple or complex exchanges that characterize business partner interactions. BPEL is a language for specifying the behavior of business processes by Web services, and as Web services. In BPEL, one can use standard Web services to invoke a process, and expose the resulting business process as a Web service. BPEL defines the elements for controlling workflow, the result of which is a fully-executable and portable script. Because a BPEL script will run on any BPEL-compliant engine, it's platform- and vendor-neutral.

 

The specification provides an XML-based grammar for describing the logic to control and coordinate Web services participating in a process flow. This grammar can be interpreted and executed by an orchestration engine, which is controlled by one of the participating business parties. The engine coordinates all of the activities in the process, and controls the system's corrective activities when exceptions occur. BPEL builds on and extends XML and Web Services specifications. BPEL is expressed entirely in XML, uses and extends WSDL, and uses WSDL and XML Schema for the data model.

 

BPEL in the context of service oriented architecture (SOA): "One of the major benefits that people seek from SOA is re-use, and combining individual applications into composite applications is a great way to accomplish that. Because the more you do that, the more you're getting closer to building a business process."

About BPM

Business Process Management, or BPM, is the practice of improving the efficiency and effectiveness of any organization by automating the organization's business processes. BPM was formerly also known as Business Process Reengineering (BPR). Many companies have business processes that are unique to their business model. Since these processes tend to evolve over time as the business reacts to market conditions, the BPM solution you choose must be easily adaptable to new conditions and requirements and continue to be a good fit for the company. In order to use BPM effectively, organizations must stop focusing exclusively on data and data management, and adopt a process-oriented approach that makes no distinction between work done by a human and work done by a computer.

BPEL

Specification (1.1).  Active BPEL, BPEL@Oracle

Business process modelling and description

Modelling

 

Description

BPEL

Concepts: (Partner links, partner link types, endpoint references, and variables)

Partner links are used to model peer-to-peer conversational partner relationships. Partner links define the shape of a relationship with a partner by defining the message and port types used in the interactions in both directions.

  • Partner link types model conversational relationships by using roles. Each role specifies exactly one WSDL portType. Commonly, the portTypes of the two roles are in different namespaces. It can also happen in one name space when it uses
  • Each partner link is characterized by a partnerLinkType. Each partner link is named, and that name is used for all service interactions via that partnerLink.
    • The role of the business process itself is indicated by the attribute myRole;
    • The role of the partner is indicated by the attribute partnerRole;
  • Business partners. While a partner link represents a conversational relationship between two partner processes, relationships with a business partner in general require more than a single conversational relationship to be established. BPEL uses the partner element to group such partner links. Partner definitions are optional, but must not overlap: a partner link must not appear in more than one partner definition.
  • Endpoint references. A port type defines abstract functionality by using abstract messages; ports provide access information, including communication endpoints and other deployment-related information.

Notions: (receive, reply, invoke, assign, throw, terminate, wait, empty, sequence, switch, while, pick, flow, scope, compensate)

Lifecycle of a business process (start, end)

  • The creation of a process instance is implicit. Activities that receive messages can be annotated to indicate that the occurrence of that activity causes a new instance of the business process to be created.
  • To be instantiated, each business process must contain at least one “start activity”. If more than one start activity is enabled concurrently, then all such activities must use at least one correlation set, and they must all use the same correlation sets.
  • If exactly one start activity is expected, the use of correlation sets is unconstrained.
  • When the activity that defines the behaviour of the process as a whole completes, the business process terminates normally;
  • When an instance is explicitly terminated by a terminate activity, it terminates abnormally;
  • A compensation handler can be invoked after a business process has terminated normally.

Message properties. The data in a message consists conceptually of two parts: application data and protocol relevant data, where the protocols can be business protocols or infrastructure protocols providing higher quality of service.

  • Business protocols can be correlation tokens used in correlation sets;
  • Infrastructure protocols add implicit information (often called message context, related to security context, transaction context, and other middleware related context) as protocol types in a message. 
  • A property is defined as a globally unique name and is associated with an XML Schema simple type.
  • A typical use of property is to name a token for correlation of service instances with messages.
  • Properties used in business protocols are typically embedded in application visible message data.
  • A property alias is introduced to map a global property to a field in a specific message part.

Data handling. Business processes model stateful interactions. The maintenance of the state of a business process requires the use of state variables, which are called variables in BPEL4WS. 

  • Expressions. Boolean-valued expressions (transition conditions, join conditions, while conditions, and switch cases), deadline-valued expressions (until attribute of onAlarm and wait), duration-valued  (for attribute of onAlarm and wait), and general expression (Assignment).
  • Variables. The name of a variable should be unique in the name space.
  • Error handling in business processes. Error handling in business processes relies heavily on the well-known concept of compensation, that is, application-specific activities that attempt to reverse the effects of a previous activity that was carried out as part of a larger unit of work that is being abandoned.
  • Compensation handler

1.      define compensation handler

2.      invoke compensation handler: a compensation handler can be invoked by using a compensate activity, which names the scope for which the compensation is to be performed. A compensation handler for a scope is available for invocation only when the scope completes normally. A compensate activity may be invoked explicitly only in the following places:

1.      In a fault handler of the scope that immediately encloses the scope for which compensation is to be performed;

2.      In the compensation handler of the scope that immediately encloses the scope for which compensation is to be performed.

·        Fault handler: Fault handling in a business process can be thought of as a mode switch from the normal processing in a scope.

1.      Semantics of activity termination

·        Event handler: events can be incoming messages that correspond to a request response, or one-way operation in WSDL.

1.      Message event

2.      alarm event

3.      enablement of events

4.      processing of events

5.      disablement of events

6.      fault handling considerations

7.      concurrency considerations

Correlation: Business processes are stateful. Messages have to be delivered to the correct ports of the correct instances, according to the interaction context of the processes. In OO, object references are a basic mechanism to handle this. However, in SOA, this is not an optimal way. In the loosely coupled service world, an explicit reference will hamper the independent evolution of business processes that are owned by different organisations. The use of implementation-dependent tokens should be avoided. In a business interaction context, the tokens can be modelled as the message subject or as part of the content of the message itself. BPEL4WS provides correlation sets to handle this type of token and to provide instance routing automatically. The token values are located inside messages by XPath queries.

  • Correlation set. BPEL4WS addresses correlation scenarios by providing a declarative mechanism to specify correlated groups of operations within a service instance. A set of correlation tokens is defined as a set of properties shared by all messages in the correlated group. This set of properties is called a correlation set.
  • Correlation sets are declared within scopes and associated with them in a manner that is analogous to variable declarations. Each correlation set is only visible in the scope in which it is defined and in all scopes nested within the scope it belongs to.
  • A global correlation set is in an uninitiated state at the beginning of a process. Correlation sets resemble late-bound constants rather than variables in their semantics.
  • BPEL4WS does not assume any sophisticated conversational transport protocols for messaging. In cases where such protocols are used, the explicit use of correlation in BPEL4WS can be reduced to those activities that establish the conversational connections.
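The routing idea behind correlation sets can be sketched in a few lines of Java. This is a deliberately simplified illustration and not the API of any BPEL engine: the class, its fields and the token values are hypothetical, and any message with unknown tokens creates a new instance here, whereas in BPEL4WS only a start activity with createInstance set to yes may do so.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of correlation-based instance routing;
// not the API of any BPEL engine.
final class CorrelationRouterSketch {

    // Stands in for one running business process instance.
    static final class Instance {
        final int id;
        final List<String> received = new ArrayList<>();
        Instance(int id) { this.id = id; }
    }

    // Correlation set values (for example customer id, order number) -> instance.
    private final Map<List<String>, Instance> instances = new HashMap<>();
    private int nextId = 1;

    // The first message carrying a given tuple of token values initiates the
    // correlation set and creates an instance; later messages carrying the
    // same values are routed to that instance.
    Instance deliver(List<String> correlationTokens, String payload) {
        Instance target = instances.get(correlationTokens);
        if (target == null) {
            target = new Instance(nextId++);
            // Copy the key so later changes to the caller's list cannot affect routing.
            instances.put(new ArrayList<>(correlationTokens), target);
        }
        target.received.add(payload);
        return target;
    }

    public static void main(String[] args) {
        CorrelationRouterSketch router = new CorrelationRouterSketch();
        Instance first = router.deliver(Arrays.asList("cust-7", "PO-1"), "purchaseOrder");
        Instance same = router.deliver(Arrays.asList("cust-7", "PO-1"), "invoice");
        Instance other = router.deliver(Arrays.asList("cust-7", "PO-2"), "purchaseOrder");
        System.out.println(first.id + " " + same.id + " " + other.id);
    }
}
```

The key is the tuple of property values, not an object reference to the instance, so the sender needs no knowledge of how the receiving side implements or identifies its instances.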

Basic activities: each activity has optional standard attributes: a name, a join condition, and an indicator whether a join fault should be suppressed if it occurs.

  • Standard elements for each activity: each BPEL4WS activity has optional nested standard elements <source> and <target>.
    • The use of these elements is required for establishing synchronisation relationships through links. Each link is defined independently and given a name.
    • The link name is used as value of the linkName attribute of the source element. 
    • An activity may declare itself to be the source of one or more links by including one or more source elements.
  • Invoke web service operation
    • An invocation can be associated with one or more correlation sets;
    • A WSDL fault is identified in BPEL4WS by a qualified name formed by the target namespace of the corresponding portType and the fault name.
    • An activity can be a compensation activity of another activity.
  • Providing web service operations. A business process provides services to its partners through receive activities and corresponding reply activities.
    • Receive. The only way to instantiate a business process in BPEL4WS is to annotate a receive activity with the createInstance attribute set to yes. By default it is no. A receive activity annotated in this way must be an initial activity in the process: the only other basic activities that may be performed prior to or simultaneously with such a receive activity must be similarly annotated receive activities.
    • It is permissible to have the createInstance attribute set to yes for a set of concurrent initial activities. All such receive activities must use the same correlation sets. And only one of the inbound messages carrying the same correlation set tokens actually instantiates the business process.
    • Reply. Only meaningful in synchronous interactions. An asynchronous response is always sent by invoking the corresponding one-way operation on the partner link.
  • Updating variable contents
  • Signaling faults
  • Waiting
  • Doing nothing

Structured activities.

  • Ordinary sequential control: sequence, switch and while.
    • Sequence contains one or more activities that are performed sequentially.
    • Switch structured activity supports conditional behaviour in a pattern that occurs quite often.
    • The While activity supports repeated performance of a specified iterative activity.
  • Concurrency and synchronisation between activities provided by flow;
    • Link semantics: links are not allowed to cross the boundary of a number of activities: while,
    • Dead path elimination:
    • Links and structured activities
  • Nondeterministic choice based on external events provided by pick.
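
A rough sketch of how links synchronise the activities of a flow, including dead path elimination. It assumes the default join condition (at least one incoming link is positive), a default transition condition of true, and that join faults are suppressed, so an activity whose join condition is false is skipped and its outgoing links become negative. The function run_flow and its data layout are hypothetical, not part of BPEL4WS.

```python
def run_flow(activities, links):
    """Evaluate a flow of activities connected by links.

    activities: dict name -> callable returning {link_name: bool}, the
                transition condition of each outgoing link (or None).
    links:      dict link_name -> (source_activity, target_activity).
    Returns the names of the executed activities in execution order.
    """
    status = {}   # link name -> True/False once its status is determined
    incoming = {name: [] for name in activities}
    outgoing = {name: [] for name in activities}
    for link, (source, target) in links.items():
        outgoing[source].append(link)
        incoming[target].append(link)

    executed, pending = [], list(activities)
    while pending:
        progressed = False
        for name in list(pending):
            # An activity can start only when all incoming links are determined.
            if any(link not in status for link in incoming[name]):
                continue
            pending.remove(name)
            progressed = True
            if incoming[name] and not any(status[l] for l in incoming[name]):
                # Dead path elimination: skip the activity and propagate
                # a negative status along all of its outgoing links.
                for link in outgoing[name]:
                    status[link] = False
                continue
            result = activities[name]() or {}
            executed.append(name)
            for link in outgoing[name]:
                status[link] = bool(result.get(link, True))
        if not progressed:
            raise ValueError("links form a cycle")
    return executed
```

For example, if activity A sets its link to B negative and its link to C positive, B is skipped, but an activity D that joins B and C still runs because one of its incoming links is positive.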

Scopes.

Serialisation scopes.

Integration mechanisms

Architecture

C/S

 

P2P

 

Middleware

CORBA,

Enterprise application and services

OO

 

CO

Java Beans, J2EE,

SO

WSDL, UDDI, SOAP,

Other standards

(from myGrid page)

One overview of the complexity of workflow standards is given in Process Management Standards Overview (April 2003) or http://xml.coverpages.org/bpm.html

  1. BPEL http://www-106.ibm.com/developerworks/library/ws-bpel/ BPEL 1.1 from IBM May 2003

·         BPML (Business Process Modelling Language) is a standard developed and promoted by BPMI.org (the Business Process Management Initiative). BPMI.org is supported by several organizations, including Intalio, SAP, Sun, and Versata. The Web Service Choreography Interface (WSCI) was submitted to the W3C in June 2002 by BEA Systems, BPMI.org, Commerce One, Fujitsu Limited, Intalio, IONA, Oracle Corporation, SAP AG, SeeBeyond Technology Corporation, and Sun Microsystems. There is a substantial overlap between BPML and WSCI. See QUT Technical report FIT-TR-2002-05 for more information about the evaluation of BPML and WSCI using the patterns.

  1. BPML available from http://www.bpmi.org/
    • BPMI originally stressed BPML as a superset of BPEL, but their BPM stack of Sept 2004 (see http://www.bpmi.org/aboutus.htm ) places WS-CDL, BPEL and their BPQL (Business Process Query Language) as alternatives between their BPMN, BPSM, and BPXL and the Web Services Stack
  2. WS-CDL http://www.w3.org/TR/ws-cdl-10/ Web Services Choreography Description Language from W3C Web Services Activity
    • Emphasis here is not on an executable language but on describing peer-to-peer collaborations between Web service participants by defining their behaviour from a global viewpoint
    • looks to have replaced WSCI http://www.w3.org/TR/wsci/ Web Service Choreography Interface W3C Note 8 August 2002
    • and also WSCL http://www.w3.org/TR/wscl10/ Web Services Conversation Language W3C Note 14 March 2002
  3. XPDL see http://www.wfmc.org/standards/docs.htm from WfMC
  4. Business Process Specification Schema (BPSS) from ebXML available from http://www.ebxml.org/specs/#technical_specifications
  5. PSL http://www.mel.nist.gov/psl/index.html Process Specification Language. Although developed from a manufacturing systems integration perspective, it addresses many of the same issues.
  6. I-N-C-A http://www.aiai.ed.ac.uk/project/ix/inca/

but do read W.M.P. van der Aalst, "Don't go with the flow: Web services composition standards exposed. Web Services - Been there done that?, Trends & Controversies", Jan/Feb 2003 issue of IEEE Intelligent Systems. The Ontoweb report Process Content Standards for the Semantic Web (state of the art, November 2001) illustrates the overlap between workflow language standards and those for the semantic web. See SWSL for additional information.

More information. Information from Oracle.

Standard organisation

  • WfMC: Workflow Management Coalition
  • OMG: Object Management Group
  • BPMI: Business Process Management Initiative
  • OASIS: Organization for the Advancement of Structured Information Standards

Other BPELS

BPEL4people,

 

 

 

 

Related technologies in SWMS

Applications domains

 

Application oriented Grid applications (CrossGrid | EuroGrid (based on UNICORE)| )

Life science: OMIC: The omic sciences entail studying entire biological ‘sets’ (e.g., the set of all genes, all transcripts, all proteins, or all metabolites) en masse rather than one or a few at a time. Finally, integromics has been used to refer to the integration of multiple sources and types of omic data.

Knowledge management

Agent technology: (Foundation for Intelligent Physical Agents(FIPA), Open Agent Architecture (OAA))

Logic: Prolog (Amzi, SWI)

Ontology: (Editor (Cooperative-Ontology Development Environment (CO-ODE), OilEd, OwlDoc, OwlValidator, Protégé), Language (DAML+OIL/OWL, RDF), Logic (Description Logic), Reasoning (FaCT, Jena, Racer));

Semantic: (Grid | Web) Semantic Grid

Scientific workflow management system

Model: (control patterns | data patterns, BPEL, Is BPEL a workflow or a service flow language?)

Engine:

User support:

 

 

High level deployment: automatic composition, information query, data integration, and interoperability among workflow;

Meta description,

Provenance[1, 2],

Mapping,

Ontology,

Experiments:

Workflow: service/ component coordination

Main operations: Annotation (Editor (Cooperative-Ontology Development Environment (CO-ODE), OilEd, OwlDoc, OwlValidator, Protégé)), storage, advertisement, and discovery;

Service: tools, components.

Underlying technologies: semantic model (semantic network, object oriented model), semantic language (DAML+OIL/OWL, RDF), and reasoning mechanism (FaCT, Jena, Racer).

Data: scientific data.

Information management

Text Mining (Lucene)

Computing management

Description language: (Resource Specification Language (RSL))

Monitoring (Active Measurement Project (AMP) | Bricks | Grid weather | MonaLisa (in VDT))

Parameter sweeping (AppLeS Parameter Sweep Template (APST) | Nimrod)

Scheduling and workload tools: (BoSS | Catalina (SDSC) | Clubmask | Grid Resource Allocation Management (GRAM) | Grid Job Submission and Monitoring (GridSAM) | GridWay | Koala | LoadLeveler | Load Sharing Facility (LSF) Scheduler | Maui cluster scheduler | Moab workload manager | Portable Batch System (PBS) | Sun Grid Engine (SGE) | SGE-GT4 | WebSubmit)

Data management                       

Data base: (MySQL | ORACLE)

Data Integration (OGSA-DAI)

Data storage management: (Net Storage (NeST) | Storage Resource Broker (SRB))

File systems (gFarm File system)

Meta data management (P2P (Edutella | Pizza))

Replication (Replica Location Service(RLS))

Grid programming environment

Programming environment, tools and services (CoGKit (for building portal, support Java Python) | CrossGrid | Grid Application Toolkit (GAT) | GridSolve/NetSolve | Grid Sphere portal | Grid Superscalar | IBIS | NiNF | Open Grid Computing Environment (OGCE) | Ptolemy | TeraGrid)

Web service resource framework (Description | pyGridware)

Grid security

AAA: (Authentication, Authorization, Accounting)

Security: (Grid Security Infrastructure (GSI) | RSA | Secure Sockets Layer (SSL))

Service oriented infrastructure

.Net based techniques ()

Java based techniques (ETTK for Web Services and Autonomic Computing | Java | JWSDP )

SOAP (Apache AXIS), possible errors.

In Apache axis1, to enable attachment support:

[1] Add the attachment option in the *.wsdd file: <service name, …. attachment = “MIME”> (see the reference manual of axis)

[2] download JAF (JavaBeans Activation Framework) and JavaMail. Obtain them from the Sun website.

[3] reload the service;
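
To make clear what the MIME attachment format means on the wire, here is a small sketch that packages a SOAP envelope and one binary attachment as a multipart/related MIME message, in the spirit of SOAP with Attachments. It is not AXIS code: the function name, the Content-ID values and the envelope content are assumptions for illustration only.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mime_package(soap_envelope, attachment_bytes, content_id):
    """Wrap a SOAP envelope and one attachment in a multipart/related message."""
    package = MIMEMultipart("related", type="text/xml")

    # Root part: the SOAP envelope itself.
    root = MIMEText(soap_envelope, "xml", "utf-8")
    root.add_header("Content-ID", "<soap-envelope>")
    package.attach(root)

    # Second part: the binary attachment, which the envelope can refer to
    # through its Content-ID.
    part = MIMEApplication(attachment_bytes)   # application/octet-stream
    part.add_header("Content-ID", f"<{content_id}>")
    package.attach(part)
    return package


envelope = "<Envelope><Body><upload href='cid:data-1'/></Body></Envelope>"
message = build_mime_package(envelope, b"\x00\x01\x02", "data-1")
```

The mail and activation libraries in step [2] play the role that the email package plays in this sketch: they build and parse the MIME parts.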

Generic Grid middleware

Forum (APGrid | GGF | Grid RPC workgroup | PRAGMA)

Middleware (EGEE, Globus (Adm), LCG, Unicore, VDT)

P2P (JXTA)

Registry (Grimoires | UDDI)

Service Oriented Architecture and Infrastructure (Open Middleware Infrastructure Institute (OMII))

Programming environment

Development (Opensource)

Language (Java: (Runtime environment, Toolkit, Virtual machine, bytecode, coding), C/C++: (GNU), Script: (perl, python), shell (bash, tcsh))

Build (Ant (build file) )

OS

OS (Linux (kernel,| documentation), Unix (Solaris), Windows (9X, 200X, NT, XP))

Hardware

Computers (Flynn’s classification, 1966: SISD (single processor computer), SIMD (vector computers), MISD, MIMD (tightly/loosely coupled))

Network (Beowulf | LAN | WAN | VPN | Internet)

Parallel computer (Shared memory multi-processors, distributed memory multi-computers).

Processors: (Intel (x86), AMD (Athlon), Cyrix, UltraSPARC, Itanium, others).

Storage (Disk, Array)


 

 

How to choose a SWMS: a deployment vision

 

We distinguish three levels in a scientific workflow management system:

-         application level:

-         mapping, integration and rapid prototyping level:

-         resource level:

 

We think a proper SWMS needs to provide support at each level and to allow a scientist

Desired features and reasons (reasons not yet filled in):

  • Resource
    • Legacy tools
    • Standard middleware
    • Invoke resources from different systems
    • Rich library for workflow components
  • Application process
    • parallel processes/tasks in an experiment (Job farming);
    • explore large parameter space (Parameter sweep);
    • knowledge based information processing (semantic level data integration).
  • Using a SWMS
    • a friendly user interface (preferably a GUI);
    • support the development of new workflow components (using java, scripts, C++, providing sufficient documentation and support);
    • execute tasks on distributed resources (clusters or Grid);
    • Human in the loop interaction
    • to interoperate with different workflow management systems.
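
As an illustration of the job farming and parameter sweep features listed above, a minimal sketch that runs one task for every point of a parameter space on a pool of workers. It assumes local threads as a stand-in for cluster or Grid nodes; the function sweep and its arguments are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def sweep(task, parameter_space, max_workers=4):
    """Run task(**point) for every combination of parameter values.

    parameter_space: dict parameter name -> list of values to explore.
    Returns a list of (point, result) pairs in a deterministic order.
    """
    names = sorted(parameter_space)
    points = [
        dict(zip(names, values))
        for values in product(*(parameter_space[name] for name in names))
    ]
    # Job farming: independent tasks are handed out to a pool of workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda point: task(**point), points))
    return list(zip(points, results))


outcome = sweep(lambda a, b: a * b, {"a": [1, 2], "b": [10, 20]})
```

Tools such as Nimrod or APST do the same thing at a larger scale, adding scheduling, staging of files and fault handling.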

 

 

 

 

 

Ptolemy II

History

Gabriel (1986 ~ 1991);

Ptolemy Classic (1990 ~ 1997)

Ptolemy II (1996 ~)

  1. A user interface tool kit called Diva;
  2. A user interface for Ptolemy II called Vergil (based on Diva);
  3. Java Plotter;
  4. 3-D animation domain

Concepts

Actor, Director, Ports, Parameters

Actor oriented programming. The term originates from Carl Hewitt in the 1970s, for autonomous reasoning agents, and was later formalised by Gul Agha. But unlike Agha’s actors, the ones in Ptolemy do not necessarily have their own control thread.

Three ways to create a Ptolemy model:
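
A toy sketch of the four concepts (actor, director, ports, parameters), assuming a very simple director that fires the actors in a fixed order. It is not the Ptolemy API: all class and method names are made up for illustration, loosely following the prefire/fire naming. The point is that the actors have no thread of their own; the director decides when they fire.

```python
from collections import deque


class Actor:
    """A component with ports and parameters but no thread of its own."""

    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.inputs = {port: deque() for port in inputs}   # port -> queued tokens
        self.outputs = {port: [] for port in outputs}      # port -> [(actor, port)]

    def connect(self, out_port, target, in_port):
        self.outputs[out_port].append((target, in_port))

    def send(self, out_port, token):
        for target, in_port in self.outputs[out_port]:
            target.inputs[in_port].append(token)

    def prefire(self):
        # Ready when every input port holds at least one token.
        return all(len(queue) > 0 for queue in self.inputs.values())

    def fire(self):
        raise NotImplementedError


class Ramp(Actor):
    def __init__(self, name):
        super().__init__(name, outputs=("out",))
        self.count = 0

    def fire(self):
        self.send("out", self.count)
        self.count += 1


class Scale(Actor):
    def __init__(self, name, factor):
        super().__init__(name, inputs=("in",), outputs=("out",))
        self.factor = factor   # a parameter of the actor

    def fire(self):
        self.send("out", self.inputs["in"].popleft() * self.factor)


class Collect(Actor):
    def __init__(self, name):
        super().__init__(name, inputs=("in",))
        self.tokens = []

    def fire(self):
        self.tokens.append(self.inputs["in"].popleft())


class Director:
    """Decides when each actor fires; here a fixed order, repeated."""

    def __init__(self, actors):
        self.actors = list(actors)

    def run(self, iterations):
        for _ in range(iterations):
            for actor in self.actors:
                if actor.prefire():
                    actor.fire()


ramp, scale, sink = Ramp("ramp"), Scale("scale", factor=3), Collect("sink")
ramp.connect("out", scale, "in")
scale.connect("out", sink, "in")
Director([ramp, scale, sink]).run(iterations=3)
print(sink.tokens)   # [0, 3, 6]
```

Replacing the Director with one that implements another firing policy changes the model of computation without touching the actors, which is the idea behind the domains.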

  1. visual
  2. XML description
  3. Ptolemy API

 

Models of computation

  1. Determinate
  2. statically schedulable
  3. time safe

 

Domains for actors:

  1. Component Interaction: CI
  2. Communicating Sequential Processes: CSP
  3. Continuous Time: CT
  4. Discrete Event: DE
  5. Distributed Discrete Event: DDE
  6. Dynamic Dataflow: DDF
  7. Discrete Time: DT
  8. Finite State Machines: FSM
  9. Giotto
  10. Graphics: GR
  11. Heterochronous Dataflow: HDF
  12. Hybrid systems
  13. Process Networks: PN
  14. Synchronous Dataflow: SDF
  15. Synchronous/Reactive: SR
  16. Timed Multitasking: TM
  17. Wireless

 

 

 

 

Workflow management systems

From the survey we choose a number of SWMSs. We want to analyse how these features are implemented and how we can include them in the new design of the VL-e framework.

 

 Key features

 

Including existing WS in the workflow

Enumerated problems

Kepler

Good GUI

How to invoke the key features

1.   Set up a WS actor

·         WS Actor from the library,

·         Type the URL of WSDL, and then choose the operations;

2.   Add a new actor (with existing WS Client)

  •  

3.   Run the simple benchmark;

 

Concept: port, (width, channel, parameter, value)

Kepler Library

1.       

 

Add a new actor:

1.      Using the existing command line actor template (make an instance).

2.      Write your own actor (see below).

 

Write your own actor:

  1. in $kepler/src, make a new directory;
  2. according to the package name, make the directory and sub-directory;
  3. write the java code, according to the Kepler actor convention: such as definition of ports, parameters, fire function;
  4. make an “actor”.xml file: describe the functionality and class location;
  5. make a MANIFEST.MF file;
  6. in both the MANIFEST and the “actor”.xml file, check that the actor id is in the correct scope;
  7. at $kepler, run ant compile;
  8. you should find some new files in: ./build/classes, ./build/src
  9. after that at $kepler,  ant buildkarlib
  10. before re-launch kepler, 1) remove .kepler, .ptolemy from the home directory, and 2) set class environment variables;
  11. generate new kepler.jar file, you can view the source of the new actor;

Scripts


Shell scripts

Perl scripts

Python scripts

 

Legacy package

Consider it as a script

Handling input and output
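
A minimal sketch of treating a legacy package as a script: the wrapper feeds the input on standard input, collects standard output and reports a non-zero exit code as an error. The function run_legacy_tool is hypothetical, and the "legacy tool" is a one-line stand-in so that the example runs anywhere.

```python
import subprocess
import sys


def run_legacy_tool(command, input_text="", timeout=60):
    """Run a command-line tool, feed it input on stdin, return its stdout."""
    completed = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"{command[0]} exited with {completed.returncode}: "
            f"{completed.stderr.strip()}"
        )
    return completed.stdout


# Stand-in for a legacy package: a one-line script that upper-cases its input.
legacy_command = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
output = run_legacy_tool(legacy_command, "acgt")
```

Inside a workflow actor, the input text would come from an input port and the returned output would be sent to an output port.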

 

Java program

 

 

Web services

Service invocation interface

 

workflow

Wrap engine as an actor

 

Interface to Nimrod

 

 

 

Taverna

WS based resources

Run a command:

The path of the command/application has to be described.

How Taverna initialises local processes

1.        When you start a Taverna instance, it first initialises a workbench framework;

2.        then it starts three default sub-windows: Advanced Model Explorer, Workflow Diagram and Available Services;

3.        in the Available Services window, it scans the available taverna.properties files and then loads the processes according to their types: local or remote.

 

How to add a Taverna process:

1.        Develop the code. It has to include the following parts: Class, xml, editor, colour, icon, taskclass

2.        Develop a taverna.properties

3.        Make a jar file, or put them separately. In the Jar file,

Knowledge/semantics enhanced WS

 

 

 

Triana

Job farming

 

 

 

 

 

 

 

 

Eclipse notes

Use as a Java development environment. Add plug-in (OWL)

Create a project

From scratch

 

From existing sources

Create project for Ptolemy.

  • Download the Ptolemy source;
  • Create a project;
  • Create a source directory;
  • Create a package called Ptolemy. The source files of Ptolemy are in the ~\ptolemy directory;
  • Import source from the download directory;
  • Compile the source. You need to download a couple of things: javax.comm.jar (Linux, Windows), javax.media.jai, Java Media Framework (jmf.jar), Joystick Java driver, X10 (Homelinux), QuickTime (QTJava.zip), JXTA.

From existing build.xml file

 

Import an existing project

 

Manage the project

 

Execute the application

 

Store the project

 

Remove the project

 

 

Ongoing projects

Demon

Research

SP 1.4a

The iterative research cycle, inverse modelling

Knowledge infrastructure and scientific workflow

 

SP 1.4b

From field observations to spatial information

Agent based workflow framework

SP 1.4c

 

 

SP 1.5a

Running R scripts in VL-e on multiple nodes through a web service

 

SP 1.5b

Histone code case- semantic data integration:

[1] genome distribution of histone H3K4Me3

[2] transcription factor binding

 

Data: genome distribution of histone H3K4Me3,

         Transcription factor

         Knowledge model of genome distribution

         Knowledge model of transcription factor

Steps: Segment distribution data from ChIP-chip data,

         explore distribution data

         Annotation distribution data

         Annotation of the factor data

         Semantic match between genome distribution data and factor

        

 

Processes

Functionality

Scenarios

Obtain data

 

 

Browse data

 

 

Browse knowledge

 

 

Annotate

 

 

Update data repository

 

 

Update knowledge repository

 

 

Integrate

 

 

 

 

 

 

 

 

 

 

 

 

 

Service oriented architecture

Basic concepts in WSDL:

  • types:
  • messages,
  • portTypes,
  • bindings,
  • services
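
To see how the five concepts fit together, here is a sketch that parses a deliberately tiny WSDL 1.1 document and counts each kind of element. The Echo service, its names and its endpoint URL are invented for the example, and the binding is left incomplete (it has no operations); only the namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

SAMPLE_WSDL = """
<definitions name="Echo"
    targetNamespace="urn:example:echo"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:echo"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types/>
  <message name="EchoRequest"><part name="text" type="xsd:string"/></message>
  <message name="EchoResponse"><part name="text" type="xsd:string"/></message>
  <portType name="EchoPortType">
    <operation name="echo">
      <input message="tns:EchoRequest"/>
      <output message="tns:EchoResponse"/>
    </operation>
  </portType>
  <binding name="EchoBinding" type="tns:EchoPortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="EchoService">
    <port name="EchoPort" binding="tns:EchoBinding">
      <soap:address location="http://localhost:8080/axis/services/Echo"/>
    </port>
  </service>
</definitions>
"""


def summarise(wsdl_text):
    """Count the top-level WSDL elements and list the abstract operations."""
    root = ET.fromstring(wsdl_text.strip())
    summary = {
        tag: len(root.findall(f"{{{WSDL_NS}}}{tag}"))
        for tag in ("types", "message", "portType", "binding", "service")
    }
    operations = [
        op.get("name")
        for op in root.iter(f"{{{WSDL_NS}}}operation")
    ]
    return summary, operations


summary, operations = summarise(SAMPLE_WSDL)
```

Reading the document bottom-up gives the concrete picture: a service exposes a port at an address, the port uses a binding, the binding implements a portType, and the portType's operations exchange messages whose parts are typed by the types section.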

Core of SOA

SOAP middleware Apache-axis is a SOAP engine.

 

Server side;

  1. Publish web services with AXIS;
    1. JWS (Java web service) files – instant deployment;
    2. Custom deployment – introducing WSDD;
    3. Scoped services:

                                                               i.      Request scope, the default, will create a new object each time a SOAP request comes in for your service;

                                                             ii.      Application scope, will create a singleton shared object to service all requests;

                                                            iii.      Session scope will create a new object for each session-enabled client who accesses your services.
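
The three scopes differ only in when a new service object is created. A minimal sketch of that rule, assuming a plain factory function for the service class; ScopedService and Counter are hypothetical names, not AXIS classes.

```python
class ScopedService:
    """Hands out service objects according to a request/application/session scope."""

    def __init__(self, factory, scope="request"):
        if scope not in ("request", "application", "session"):
            raise ValueError(f"unknown scope: {scope}")
        self.factory = factory
        self.scope = scope
        self._singleton = None
        self._per_session = {}

    def instance_for(self, session_id=None):
        if self.scope == "request":
            # A new object for each incoming request.
            return self.factory()
        if self.scope == "application":
            # One shared object serves all requests.
            if self._singleton is None:
                self._singleton = self.factory()
            return self._singleton
        # Session scope: one object per session-enabled client.
        if session_id not in self._per_session:
            self._per_session[session_id] = self.factory()
        return self._per_session[session_id]


class Counter:
    """Stand-in service class that keeps some state."""

    def __init__(self):
        self.hits = 0
```

With request scope, state such as Counter.hits is lost between calls; with application scope it is shared by all clients, so the service object must be safe for concurrent use.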

  1. Using the AdminClient to remotely configure the server;
  2. Service styles;
    1. RPC services are the default. They follow the SOAP RPC conventions and encoding rules. AXIS will deserialize XML into Java objects which can be fed to your service and will serialize the returned Java objects from your service back into XML. Objects will be encoded via multi-ref serialization, which allows object graphs to be encoded.
    2. Document services: no SOAP encoding, only XML Schema. The Java objects are still bound to the XML document, so the user only deals with Java objects. The entire element of the XML document is handed to the service method as a single bean;
    3. Wrapped services: no SOAP encoding, only XML Schema. The Java objects are still bound to the XML document, so the user only deals with Java objects. The items in the element of the XML document are handed to the service method as separate beans;
    4. Message services: the author is required to code at the actual XML level. The messages are not turned into Java objects by default.
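
The difference between the document and wrapped styles is easiest to see in the signature of the service method. A small sketch, assuming a hypothetical purchaseOrder element and using dictionaries in place of beans; the dispatch functions are illustrations, not AXIS code.

```python
import xml.etree.ElementTree as ET


def dispatch_document(method, body_element):
    """Document style: the whole element arrives as one bean-like object."""
    bean = {child.tag: child.text for child in body_element}
    return method(bean)


def dispatch_wrapped(method, body_element):
    """Wrapped style: each child of the element arrives as its own argument."""
    arguments = {child.tag: child.text for child in body_element}
    return method(**arguments)


body = ET.fromstring(
    "<purchaseOrder><item>SK001</item><quantity>1</quantity></purchaseOrder>"
)


def order_document(purchase_order):
    return purchase_order["item"], int(purchase_order["quantity"])


def order_wrapped(item, quantity):
    return item, int(quantity)


doc_result = dispatch_document(order_document, body)
wrapped_result = dispatch_wrapped(order_wrapped, body)
```

Both calls return the same values; only the shape of the method's parameter list differs. A message style service would receive the element itself and do its own parsing.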

Client side:

  1. service, call;
  2. set endpoint, add parameter, set return type;
  3. invoke
  4. untyped servers
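
Behind the service/call steps, the client ends up posting a SOAP envelope to the endpoint. A rough sketch of building such an RPC request, assuming SOAP 1.1; the function name, the namespace urn:example:echo and the operation echoString are invented, and a real engine such as AXIS adds type and encoding information that is omitted here. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_rpc_request(namespace, operation, parameters):
    """Build a SOAP 1.1 envelope for an RPC call.

    parameters: list of (name, value) pairs, in call order.
    """
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation becomes an element in the service namespace ...
    call = ET.SubElement(body, f"{{{namespace}}}{operation}")
    # ... and each added parameter becomes a child element of it.
    for name, value in parameters:
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_rpc_request("urn:example:echo", "echoString", [("text", "hello")])
```

Setting the endpoint decides where this document is posted, and setting the return type tells the client how to deserialize the element that comes back in the response body.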

XML Java data mapping in AXIS:

  1. AXIS can send via SOAP, with restricted interoperability: Java collections
  2. AXIS cannot send via SOAP:
    1. Arbitrary objects without pre-registration;
    2. Remote references
  3. Bean mapping and serialization. AXIS includes the ability to serialize/deserialize, without writing any code, arbitrary Java classes which follow the standard Java bean pattern of get/set accessors.
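
The bean mapping idea can be sketched with introspection: every property that has both a getter and a setter becomes one XML element, and the reverse direction calls the setters. This is only an illustration of the pattern, assuming get_/set_ method names and string values; bean_to_xml, xml_to_bean and Person are hypothetical.

```python
import xml.etree.ElementTree as ET


def bean_to_xml(bean, tag=None):
    """Serialize every get_x/set_x property of a bean into child elements."""
    element = ET.Element(tag or type(bean).__name__)
    for attribute in sorted(dir(bean)):
        if not attribute.startswith("get_"):
            continue
        name = attribute[len("get_"):]
        # Only properties with both accessors are mapped.
        if not hasattr(bean, "set_" + name):
            continue
        ET.SubElement(element, name).text = str(getattr(bean, attribute)())
    return element


def xml_to_bean(element, bean_class):
    """Create a bean and call the setter that matches each child element."""
    bean = bean_class()
    for child in element:
        getattr(bean, "set_" + child.tag)(child.text)
    return bean


class Person:
    """Example bean: private fields, public get/set accessors."""

    def __init__(self):
        self._name = ""
        self._city = ""

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

    def get_city(self):
        return self._city

    def set_city(self, value):
        self._city = value


person = Person()
person.set_name("Ada")
person.set_city("Amsterdam")
restored = xml_to_bean(bean_to_xml(person), Person)
```

Classes that do not follow the accessor pattern cannot be handled this way, which is why arbitrary objects need a registered custom serializer.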

AXIS2

Download AXIS2, and then put it into the container’s webapps directory.

Tasks

Setup experimental environment in Linux:

-         Install tomcat

1.      tomcat 4.2

2.      axis 1.3

3.      be aware of the compatibility between JVMs

-         Set up the user account in ~/conf/tomcat-users.xml

-         Copy AXIS to ~/webapps in the tomcat directory

-         Start the Tomcat server

-         Be aware of the differences between Tomcat and AXIS.

1.      Tomcat is a Java servlet container;

2.      AXIS is a SOAP