A question SAS programmers/users sometimes ask me is “What’s the difference between SAS Enterprise Guide (EG) and SAS Data Integration Studio (DI)?” Both run SAS code and can generate tables and reports… my one-line response is “it’s about those SAS jobs being SAS metadata objects.”
It’s about the Metadata
What do I mean by this? Well, in EG you work with tables and SAS code (either with EG tasks or hand-written code). If you’re a SAS programmer who’s upgraded from SAS Display Manager (aka BASE SAS) then you now get the benefit of a program editor with autocompletion, built-in syntax highlighting, integrated help and automatic code formatting. If you are an analyst then you get the power of being able to generate SAS code for your reporting and querying needs via point-and-click tasks within your EG project. In both situations you generate SAS code (perhaps even a stored process) all working off SAS tables that may have libraries assigned in metadata… and there is that word… metadata.
As I’m sure you know, the general definition of metadata is “data about data”. So what is that? It’s the details about the data you are working with. For example, if you have a customer table in your Oracle database then that metadata covers the Oracle library engine (assuming that you are using SAS/Access to Oracle), the connection details for the Oracle database, the column attributes (such as name, type, length and format), when the table was created and modified, and any security access controls. So why is this important…
With the SAS 9 platform the SAS metadata server and associated metadata repositories are the heart of SAS. The SAS 9 Intelligence Platform Overview document provides a great background to help understand how all the pieces in the SAS platform fit together and the importance of keeping that metadata heart pumping.
Why do SAS Enterprise Guide Users Care about Metadata?
So as an EG user why do I need to know about SAS metadata? The metadata server enables consistent, centralized storage of information about all the resources in your environment. It stores information about many things including: users, groups, roles, capabilities, tables, columns, libraries, cubes, stored processes, information maps, DI jobs and many more. This means you can access many of these resources in EG without having to set them up yourself. Your administrator has already done this work for you and made it available in metadata so you can concentrate on your projects.
These EG projects, whilst using metadata, are not commonly stored in metadata: they are often stored as .sas files or .egp files outside of metadata. This limits their availability for things like impact analysis, searching, metadata audit and security. DI Studio jobs, by contrast, are represented in metadata with a great deal of detail.
What is the Value of SAS DI Studio?
DI is a metadata-driven visual design tool where you create jobs and their components as metadata objects (potentially within an integrated change management environment). Ultimately SAS code is generated, although it is more of a by-product of the process. The job metadata is the primary “document” and is stored with enough detail in metadata to allow the additional benefits.
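To make the benefit concrete, here is a toy sketch of impact analysis in Python. It is not how SAS stores or queries metadata; the `JobMetadata` class, the `impacted_jobs` function and all job and table names are hypothetical, made up for illustration. The only point is that once each job records its source and target tables as structured data, a question like "which jobs are affected if this table changes?" becomes a simple lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobMetadata:
    """Hypothetical, simplified stand-in for a job registered in metadata."""
    name: str
    sources: frozenset  # tables the job reads
    targets: frozenset  # tables the job writes


def impacted_jobs(jobs, changed_table):
    """Return names of jobs that read the changed table, directly or downstream."""
    impacted = []
    dirty = {changed_table}  # tables known to be affected so far
    progress = True
    while progress:
        progress = False
        for job in jobs:
            if job.name not in impacted and job.sources & dirty:
                impacted.append(job.name)
                dirty |= job.targets  # the job's outputs are now affected too
                progress = True
    return impacted


# Illustrative job definitions (assumed names, not from any real environment).
jobs = [
    JobMetadata("load_customer_dim",
                frozenset({"ora.customer"}),
                frozenset({"dw.customer_dim"})),
    JobMetadata("build_sales_mart",
                frozenset({"dw.customer_dim", "dw.sales_fact"}),
                frozenset({"mart.sales"})),
    JobMetadata("load_product_dim",
                frozenset({"ora.product"}),
                frozenset({"dw.product_dim"})),
]

print(impacted_jobs(jobs, "ora.customer"))
```

With the sample jobs above, a change to `ora.customer` flags both the job that reads it and the downstream mart build, while the product load is left alone. A folder of .sas or .egp files gives you no such structure to query, which is the gap the job metadata fills.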
DI jobs are generally used for consolidating and managing enterprise source and target data with process flows that extract, transform and load (ETL) operational data into data warehouses and data marts. That process can be simple or complex depending on the organisation. Some organisations may start out using EG as an ETL tool and later realise that they have difficulty managing everything they need to do, such as data validation, history, data cleansing (via handy data quality tools such as DataFlux), metadata reporting, impact analysis, complex scheduling of dependencies, and change management across environments (dev, test, prod).
So that’s why I consider that the difference between EG and DI primarily relates to “those SAS jobs being SAS metadata objects”… EG is a great tool for consuming SAS metadata to do querying, reporting and analysis. Whilst EG can be used for basic ETL, when you need enterprise data integration capabilities, DI Studio becomes the better choice. Its greater use of metadata allows for the management of much more complex ETL processes.
For detailed functional differences between EG and DI, check out the product user documentation. For SAS Enterprise Guide this is found within the product and for SAS Data Integration Studio click here.
Share your thoughts… What do you consider the difference between SAS Enterprise Guide and SAS DI Studio?