DataWarehouse.com | Brought to you by DMReview

SOA Forgot the Data: Composite Data Services and Data Governance

Summary: This article introduces composite data services, a framework that is especially powerful in combination with XML data management, SOA registries and repositories.

The authors would like to thank Bob Albo, VP of Business Solutions at Raining Data, and Murty Gurajada, software architect, XML-Centric Applications and Platforms group at Raining Data, for technically reviewing this article.

Many organizations today are moving steadily toward implementing a service-oriented architecture (SOA) for standards-based software interoperability and business flexibility. However, most forget about the data integration, governance and management issues associated with true interoperability until it's too late. As loosely coupled systems based on SOA begin to interact, data integration, quality and harmonization issues are exacerbated and become significant barriers to successful integration efforts. These problems stem from not treating “data” or “information” as a critical asset, contrary to the way people, capital equipment and inventory are viewed. In order for organizations to be successful with their SOA implementation, they must first recognize data as a business-critical asset.

According to Gartner, service-oriented business applications (SOBAs) require a robust set of services that capture, manipulate, transform and reconcile data and semantics. Data services that accomplish detailed transaction manipulation and provide a transparency of business rules, semantic mappings and metadata management enable the necessary linkage and binding between process and information when deploying composite applications via SOA techniques.

The concept of data services is rapidly gaining interest as an approach for addressing data integration and governance challenges in SOA. Data services will increasingly become recognized as a critical component of SOA initiatives.

This article introduces composite data services, a framework that, in combination with XML data management, SOA registries and repositories, empowers SOA and maximizes the governance and accessibility of information.

Problem: Current SOAs Overlooked Data and Metadata Integration

A step-by-step look shows how data duplication, quality and consistency issues arise when traditional monolithic architectures evolve into loosely coupled architectures.

As shown in Figure 1, a traditional monolithic architecture involves an application interacting with data directly:

Figure 1: Traditional Monolithic Architecture

As seen in Figure 2, a loosely coupled architecture based on SOA principles involves an application interacting with data through business services. Consider a real-world scenario in which an application needs address data that is spread across many data sources. Rather than querying those sources itself, the application interacts with a business service that retrieves the data from the various sources on its behalf.

Figure 2: Loosely Coupled Architecture

While this satisfies the requirement for a loosely coupled architecture, it introduces several other issues, such as performance and scalability considerations in the SOA. Now suppose that the customer address data is inconsistent across the data sources. This is a fundamental issue that the business service might try to resolve using complex business logic and code. However, a business service implementation with even a modest amount of complexity can quickly become a nightmare to manage. In addition, the transparency of the business logic used by the business service, and the flexibility to customize that logic, typically fall by the wayside.

As interest in SOA grows, questions remain about the role of data when building composite applications. XML and simple object access protocol (SOAP) merely lower the barriers to interoperability. It is still a lot of hard work to adequately resolve issues around information location, context, meaning and accuracy. A key to successfully implementing an SOA is to use a flexible framework that helps organizations address these issues and understand and define the relationships between business-centric and data-centric services.

Solution: Introducing XML-Centric Data Components and Information Services

The solution lies in a specialized layer, shown in Figure 3, called composite data services. This layer handles data and information inconsistencies, quality, accuracy and harmonization across heterogeneous sources.

Figure 3: XML-Centric Composite Data Services

Various classes of services exist in SOA, including infrastructure services (which provide low-level functions, such as messaging, registration and authentication) and business services (which perform higher-level business functions, such as processing an order).

A data service is a new class of service that performs data-centric tasks (such as data access, integration, transformation, analysis, monitoring, movement, profiling, enrichment, validation, verification, quality and governance). From a technical implementation perspective, data services, like traditional business services, support the three important principles of service orientation: modularity, well-defined interfaces and loose coupling.

As shown in Figure 4, atomic low-level data services are combined to create composite data services to support sophisticated data-centric requirements such as data governance, master data management, data cleansing and enrichment. Composite data services are optimally developed using an XML-centric, data-driven workflow designer and deployed and orchestrated on an XML-centric workflow engine.

Figure 4: Composite Data Services
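
The composition idea can be illustrated with a minimal Python sketch. This is not the XML-centric workflow designer or engine described above; it only shows the principle that small, single-purpose data services can be chained into a composite one. The record layout and the names `normalize_postcode`, `drop_incomplete`, `deduplicate` and `compose` are assumptions made for this example.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
DataService = Callable[[List[Record]], List[Record]]


def normalize_postcode(records: List[Record]) -> List[Record]:
    """Atomic service: strip spaces from postcodes and upper-case them."""
    return [
        {**r, "postcode": r.get("postcode", "").replace(" ", "").upper()}
        for r in records
    ]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Atomic service: keep only records with a street and a postcode."""
    return [r for r in records if r.get("street") and r.get("postcode")]


def deduplicate(records: List[Record]) -> List[Record]:
    """Atomic service: keep the first record per (street, postcode) pair."""
    seen = set()
    unique = []
    for r in records:
        key = (r["street"].strip().lower(), r["postcode"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def compose(*services: DataService) -> DataService:
    """Chain atomic services into one composite service, applied in order."""
    def composite(records: List[Record]) -> List[Record]:
        for service in services:
            records = service(records)
        return records
    return composite


# Hypothetical composite data service for address cleansing.
clean_addresses = compose(normalize_postcode, drop_incomplete, deduplicate)
```

Because every atomic service shares the same interface, a service can be reordered, replaced or reused in another composite without touching the consuming business service.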

XML-centric composite data services are a compelling feature for any SOA because they natively handle SOA artifacts, which are predominantly XML-based. When combined with a native XML database, they can flexibly enable sophisticated data integration features for accessing, cleansing, transforming, mapping, aggregating and moving data. Additionally, when combined with a mid-tier data service that acts as a write-through cache in front of back-end data sources, complies with XA transactions and uses optimized refresh policies, composite data services can address the scalability and performance requirements that SOAs demand.
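
As a rough illustration of the write-through idea, the following Python sketch keeps a mid-tier cache in step with a back-end store by writing to the back end first and updating the cache afterward. It is a simplification under stated assumptions: a plain dictionary stands in for the back-end data source, XA transaction handling is omitted entirely, and `refresh` is a hypothetical stand-in for a real refresh policy.

```python
class WriteThroughCache:
    """Minimal write-through cache in front of a back-end data source."""

    def __init__(self, backend: dict):
        self._backend = backend  # assumption: a dict stands in for the back end
        self._cache = {}

    def read(self, key):
        # Serve from the cache; load from the back end on a miss.
        if key not in self._cache:
            self._cache[key] = self._backend[key]
        return self._cache[key]

    def write(self, key, value):
        # Write-through: the back end is updated first, then the cache,
        # so the cache never holds a write the back end has not accepted.
        self._backend[key] = value
        self._cache[key] = value

    def refresh(self, key):
        # Drop the cached entry so the next read reloads it from the back end.
        self._cache.pop(key, None)
```

Reads after the first one are served from the cache, which is where the performance benefit comes from; the refresh step matters when other systems change the back end directly.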

As mentioned earlier, composite data services can be developed to address a number of typical data-centric activities and challenges in SOA, as outlined below in Figure 5.

Figure 5: Types of Data Services
 

If you applied composite data services to the aforementioned real-world scenario, the address data spread across various data sources and formats would be composed into clean, accurate information delivered in a timely fashion to the consuming business service or SOBA. The support for the graphical composition of the composite data services affords ease of use and flexibility to even nontechnical users.

Introducing Data Governance – Needs and Benefits

Data and information assets are currently overlooked in most governance initiatives. SOA governance, the latest buzzword these days, is all about the lifecycle management and access controls of SOA services. While this is a critical aspect of governance, it is equally important to implement data governance, which has to do with lifecycle management and access controls for enterprise data.

In an analysis published in November 2006, Gartner analyst David Newman states, “By 2010, more than 50 percent of early adopter organizations migrating toward SOA will fail at their first attempts, due to a lack of rigor in enforcing data governance and information management policies.”

The complexity and amount of data that an enterprise needs to harness and manage continue to grow. Information is a vital enterprise asset and is critical to business success. Consistent and accurate information is indispensable to enterprise resource planning systems, customer relationship management applications, business intelligence tools and certain classes of corporate documents (e.g., regulatory compliance). Most enterprises make significant investments in computer systems but limited investments in ensuring information quality. Information, like a physical asset, degrades with time and use. It is thus imperative to have a data governance model that ensures information is leveraged and consistent across the enterprise.

Some aspects of data governance may be achieved using business rules; however, the governance model includes important roles and responsibilities for human participants. Data stewards, data captains and data governance committees play an important part in governing processes (e.g., data change and approval process), enterprise standards (e.g., metadata and policies) and technology (data governance tools, data-centric workflow, XML databases, XQuery). The roles played by human participants are often not full-time jobs, so the enabling technologies must be extremely flexible, have little to no learning curve and have intuitive and information-rich user interfaces to support data governance processes for high-value data.
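
A data change and approval process of the kind mentioned above might be sketched as follows. This is only an illustration in Python, not a description of any particular data governance tool; the class names `ChangeRequest` and `GovernedStore`, the steward check and the audit trail format are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ChangeRequest:
    key: str
    new_value: str
    requested_by: str
    status: Status = Status.PENDING
    audit: List[str] = field(default_factory=list)


class GovernedStore:
    """High-value data that changes only after a data steward approves."""

    def __init__(self, stewards: Set[str]):
        self.stewards = set(stewards)
        self.data: Dict[str, str] = {}

    def propose(self, key: str, new_value: str, user: str) -> ChangeRequest:
        # Proposing a change does not modify the governed data.
        request = ChangeRequest(key, new_value, user)
        request.audit.append(f"proposed by {user}")
        return request

    def review(self, request: ChangeRequest, steward: str, approve: bool) -> None:
        if steward not in self.stewards:
            raise PermissionError(f"{steward} is not a data steward")
        if request.status is not Status.PENDING:
            raise ValueError("request has already been reviewed")
        if approve:
            self.data[request.key] = request.new_value
            request.status = Status.APPROVED
        else:
            request.status = Status.REJECTED
        # Record who made the decision, for later auditing.
        request.audit.append(f"{request.status.value} by {steward}")
```

The point of the sketch is the separation of roles: anyone may propose a change, but only a steward's decision alters the data, and each step leaves an audit entry.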

Composite data services can alleviate the SOA migration risk described by Gartner through enhanced governance, risk, compliance, security and quality. Composite data services will have a dramatic effect on improving enterprise data governance as indicated in Figure 6.

Figure 6: Data Governance Practices Comparison

Overview: XML and XQuery Data Components and Services

As shown in Figure 7, overarching business processes and applications can consume the modular and reusable composite data services as a part of the business logic by rapidly integrating with open standard endpoints designed to provide loose coupling. The figure also represents the recommended technology stack required to deliver high-performance data management through composite data services in an SOA. This stack comprises an XML data management server for the natural persistence of SOA data, an SOA registry repository for lifecycle management of all SOA artifacts and a data-centric workflow infrastructure for collaboration and governance. Further, the XML data management server enables a metadata cache for optimized data retrieval, a master data management repository for a single source of truth and a message repository for auditing and lineage.

Figure 7: Recommended Technology Stack for Data-Centric SOA

Most SOA data, artifacts, metadata and messages are XML, so an XML database offers an ideal repository; XQuery is an ideal language for SOA data access and manipulation. To support non-XML artifacts, a good XQuery technology should provide support for querying both XML databases and non-XML data sources such as RDBMS, file systems, Web services and Java applications. Such XQuery implementations are wrapped in Web services and orchestrated into composite data services to solve the data-centric problems of SOA implementations.
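
To make the idea of one query layer over XML and non-XML sources concrete, here is a small Python sketch that stands in for what an XQuery implementation would do. It reads the same customer from an XML document and from a relational table, then keeps the most recently updated address. The element names, table layout, sample values and the "latest update wins" rule are assumptions chosen for the example, and Python's standard `xml.etree.ElementTree` and `sqlite3` modules are used in place of an XML database and an enterprise RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML source holding one customer address.
XML_SOURCE = """<customers>
  <customer id="42"><city>Boston</city><updated>2006-11-01</updated></customer>
</customers>"""


def from_xml(doc: str) -> list:
    """Read customer records from an XML document."""
    root = ET.fromstring(doc)
    return [
        {"id": c.get("id"), "city": c.findtext("city"),
         "updated": c.findtext("updated"), "source": "xml"}
        for c in root.iter("customer")
    ]


def from_sql(conn: sqlite3.Connection) -> list:
    """Read customer records from a relational table."""
    rows = conn.execute("SELECT id, city, updated FROM customer").fetchall()
    return [
        {"id": str(cid), "city": city, "updated": updated, "source": "rdbms"}
        for cid, city, updated in rows
    ]


def latest_address(records: list) -> dict:
    """Keep, per customer id, the record with the latest ISO-format date."""
    best = {}
    for r in records:
        current = best.get(r["id"])
        if current is None or r["updated"] > current["updated"]:
            best[r["id"]] = r
    return best


# Hypothetical relational source, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, city TEXT, updated TEXT)")
conn.execute("INSERT INTO customer VALUES (42, 'Cambridge', '2006-12-15')")

merged = latest_address(from_xml(XML_SOURCE) + from_sql(conn))
```

Both sources are mapped to the same record shape before any reconciliation rule is applied, which is what lets the rule itself stay independent of where the data came from.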

Composite data services should be easily organized into taxonomies and persisted in an XML database along with all of the technical, business and operational metadata that may describe such services. XML databases fill a critical gap in SOA by providing an enterprise-grade persistence infrastructure that can be used for developing flexible and highly nested taxonomy and ontology meta-models as well as data governance, lineage and impact analysis.
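
Impact analysis over such metadata can be pictured as a graph traversal. The Python sketch below assumes a hypothetical registry, `DEPENDS_ON`, that records which artifacts each service consumes, and walks it in reverse to find everything affected when one data source changes. The artifact names are invented for illustration; a real deployment would query this metadata from the registry or XML database.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical registry metadata: each artifact and the artifacts it consumes.
DEPENDS_ON: Dict[str, List[str]] = {
    "order_process": ["clean_addresses"],
    "clean_addresses": ["crm_db", "billing_xml"],
    "credit_check": ["billing_xml"],
}


def impacted_by(source: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every artifact that directly or indirectly consumes `source`."""
    # Invert the dependency map: input -> artifacts that consume it.
    consumers: Dict[str, List[str]] = {}
    for artifact, inputs in depends_on.items():
        for item in inputs:
            consumers.setdefault(item, []).append(artifact)

    # Breadth-first walk from the changed source through its consumers.
    seen: Set[str] = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for consumer in consumers.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen
```

Running the same walk in the forward direction, from a service to its inputs, would give the lineage view instead.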

Composite data services empower SOA and data governance by eliminating the need to write code and abstracting out the high degree of complexity of data-centric tasks from SOA business processes. They provide coarse-grained orchestration of multiple data sources, improved performance, sophisticated search and discovery, flexibility and standards adherence. They improve regulatory compliance and enhance visibility and control of business-critical data and information assets.

References:

  • Mark A. Beyer, David Newman, Daniel Sholler and Ted Friedman. “The Emerging Vision for Data Services: Becoming Information-Centric in an SOA World.” April 24, 2006.
  • David Newman. “EIM Reference Architecture: An Essential Building Block for Enterprise Information Management.” September 14, 2005.
  • David Newman and Ted Friedman. “Data Integration Is Key to Successful Service-Oriented Architecture Implementations.” October 12, 2005.
  • Ivan Chong and Ashutosh Kulkarni. “Enterprise Data Integration: A Critical Piece of a Service-Oriented Architecture.” February 24, 2006.
  • Ash Parikh and Murty Gurajada. “SOA For the Real World.” November 29, 2006.

Ash Parikh is the director of technology and development for the XML-Centric Applications and Platform Group at Raining Data Corporation. He can be reached at ash@rainingdata.com.

Ajay Ramachandran is CTO and vice president of XML-Centric Applications and Platforms at Raining Data.
 
Premal Parikh is a lead architect/engineering manager at Raining Data Corporation. He has more than 13 years of experience in development of enterprise software products and solutions, which includes designing and architecting products.
 


SourceMedia