SAS Enterprise Guide Process Flows: The Good, The Bad, and The Ugly

Submitted by Quentin McMullen on March 13, 2013, 9:30 am

SAS Programmer meets SAS Enterprise Guide Process Flows

When an old-school SAS programmer meets Enterprise Guide, I suspect the first questions are often along the lines of “What’s a Project?”, “What’s a Process Flow?”, and of course “Why do I need them? Can’t I just code?” The honest answer is yes: you can open Enterprise Guide, open a code window, and just code. But I’ve found that process flows are useful to me as a programmer, even though I use them in a non-traditional way.

The “Bad”

First, process flows are not really “bad.”   But if you like to write code, process flows can seem like annoying window dressing at first glance.  More importantly, I’ll do anything for a blog post theme, and I love a good Spaghetti Western.

Many introductions to Enterprise Guide start with an explanation of a process flow. As shown below, a process flow is a set of tasks (code modules) that are executed in sequence.

ProcessFlow

The above process flow starts with a source dataset, PRDSALE, which feeds the Summary Statistics task to create a summary dataset, which in turn feeds the Bar Chart task to create a chart. Each task node of the process flow is SAS code generated by Enterprise Guide. The Summary Statistics task is PROC MEANS. The Bar Chart task is PROC GCHART. The arrows connecting the nodes are NOT SAS code. They are defined as part of the Enterprise Guide project file, but are not represented in the code. If you want to be able to run your code outside of Enterprise Guide, you can’t rely on process flows to define the order in which code should be executed. (Actually, to be fair, there is an export utility in Enterprise Guide which will convert a process flow, including the connections, to a SAS program for you.)
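
For readers who think in code rather than icons, the flow above corresponds roughly to the hand-written sketch below. This is not the code Enterprise Guide generates, which is longer and wrapped in its own housekeeping statements. It assumes the source table is SASHELP.PRDSALE, and the choice of COUNTRY and ACTUAL, along with the names work.summary and total_actual, is made up for illustration. PROC GCHART requires SAS/GRAPH.

```sas
/* Step 1: summarize the source data (Summary Statistics task = PROC MEANS). */
/* SASHELP.PRDSALE, COUNTRY, and ACTUAL are assumptions for this sketch.     */
proc means data=sashelp.prdsale noprint nway;
  class country;
  var actual;
  output out=work.summary sum=total_actual;
run;

/* Step 2: chart the summary (Bar Chart task = PROC GCHART). */
proc gchart data=work.summary;
  vbar country / sumvar=total_actual;
run;
quit;
```

In a plain SAS program the order of execution comes only from the order of the steps in the file, which is the job the arrows do inside the project.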

Modular programming in SAS is a good idea. SAS programmers may employ a variety of methods to support modular programming, without relying on Enterprise Guide node connectors. One approach is to have a driver program, driver.sas, which invokes other SAS programs via %include calls. Another approach is to use the macro language to define and invoke modules. (For more on structured programming in SAS, see this excellent paper by Ed Heaton.) These methods allow you to develop a logical process flow (i.e. code modules/nodes executed in sequence, perhaps with conditional logic controlling which modules execute), without requiring that code be executed in Enterprise Guide.
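
As a minimal sketch of the macro-based approach, the example below defines three stub modules and a driver macro that runs them in order, with one step made conditional. All macro names and the plot= parameter are hypothetical; real modules would contain DATA steps and procedures instead of %put statements.

```sas
/* Stub modules: each just writes a line to the log. */
%macro GetData;   %put NOTE: GetData ran.;   %mend GetData;
%macro MakeTable; %put NOTE: MakeTable ran.; %mend MakeTable;
%macro MakePlot;  %put NOTE: MakePlot ran.;  %mend MakePlot;

/* Driver macro: the sequence and the conditional logic live in code, */
/* not in process flow connectors.                                    */
%macro RunReport(plot=Y);
  %GetData
  %MakeTable
  %if %upcase(&plot) = Y %then %do;
    %MakePlot
  %end;
%mend RunReport;

/* Run the report without the plot step. */
%RunReport(plot=N)
```

Because the sequence is defined in the macro itself, the same code can run in Enterprise Guide, in a stored process, or in batch.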

The Good

When I said that process flows are “bad”, what I meant was that I don’t like the connections in process flows, because I like to write those connections myself in SAS code. And I don’t like clicking menus to generate nodes of SAS code, because I find it easier to just write SAS code myself. And I don’t know (and don’t really want to learn) how to debug a process flow, because I’m more comfortable debugging SAS code.  But I love that process flows provide a great visual interface to my SAS code.

Suppose I’m working on a stored process which generates a simple report. I might end up with a process flow which looks like:

QProcessFlow

The process flow serves as a desktop with links to everything I need for a project. The first icon is the stored process. This is one of my minimalist stored processes, so it %includes RunReport.sas, which in turn %includes GetData.sas, MakeTable.sas, and MakePlot.sas. If I want to update the stored process itself (change which server it executes on, or update the prompts), I open the stored process. If I need to modify the plot, I open MakePlot.sas. The actual .sas files are sitting in our standard directory structure. Every icon in the process flow is a link. I often add links to secondary items in the process flow as well. These might be links to code samples, utility macros, descriptive notes, even non-SAS files such as Word documents with data dictionaries, or meeting minutes.

I love using process flows for this sort of organization. Over time I’ve become a big fan of modular programming. I have grown so fond of using an Enterprise Guide process flow as a desktop providing easy access to all of my shortcuts for a project that now, when I am using DM SAS, it sometimes feels like a hassle to open a code module (File->Open->navigate through Windows dialog box… ugh).

Multiple Process Flows

I recently learned that a single Enterprise Guide project file can contain multiple process flows.  When I don’t have access to separate servers for development, test, and production, I sometimes use directory structures to approximate these environments.  In a /Project directory, I might have Project/Dev/Code, Project/Test/Code, and Project/Prod/Code.  I’ve been playing with having separate process flows named Dev, Test, and Prod.  With this setup, I can easily navigate  among the “environments.”

EGMultipleProcessFlows
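
The directory-based “environments” can also be selected in code. Here is a minimal sketch, assuming the Dev/Test/Prod layout described above sits under /Project and that each Code directory holds its own copy of RunReport.sas. The macro variable names env and codedir are made up for this example.

```sas
/* Pick the environment once; everything else is derived from it. */
%let env     = Dev;                  /* Dev, Test, or Prod     */
%let codedir = /Project/&env/Code;   /* assumed root: /Project */

%put NOTE: Running code from &codedir;

/* Run the driver program for the chosen environment. */
%include "&codedir/RunReport.sas";
```

Changing the single %let env line points the same driver at Test or Prod.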

The Ugly

Much as I’ve come to appreciate process flows, there is one ugly aspect to how I use them. I start with a pretty process flow like the one above, with all my links laid out precisely in some visual organizational scheme. Then when I run code, Enterprise Guide adds new icons for the data sets created, output files, and log files. And my beautiful process flow becomes uglified by those icons and connectors:

ProcessFlow_expanded

These extra icons aren’t a problem, but I wish there were a way to keep them from being added. Actually, while writing this post I realized I can turn off generation of icons for output files by unchecking Tools->Options->Project Views->Show generated results. If anyone knows a way to turn off creation of icons for created datasets, please let me know.

What’s Your Process?

If you’re a SAS programmer using Enterprise Guide, do you have a favorite tip for developing or utilizing process flows?  If so, please drop a note in the comments.

Quentin McMullen

Quentin McMullen has been programming in SAS for 15 years, and for the past year has been working on SAS BI projects. He has presented at national and regional SAS user group conferences, and can often be found corresponding with colleagues on SAS-L.

18 Comments »

  • Quentin McMullen says:

    Thanks Chris,
    Setting it to 0 works for me. Didn’t know about auto-arrange either. Will give it a shot.
    –Q.

  • Quentin McMullen says:

    Thanks much Bobby,
    A partial solution is much better than no solution! : )

  • RoseAG says:

    Absolutely!

    We have development/production areas and I used to worry about executing something in production by mistake until I started doing this.

  • Chris U says:

    There’s an option that can reduce (or set to zero) the number of data sets that get added to the process flow. I think I have mine set at 5. Once I reach that, I have to browse to the library and right click -> Add to Project to get it into the process flow.

    Also, right clicking on the process flow and unchecking ‘auto arrange’ allows me to de-uglify the process flow once I’m done adding all of the nodes (particularly, keeping arrows from crossing over one another all over the place).

  • This is what I love so much about the SAS community – if you don’t know how to do it – someone else surely does and will help you figure it out.
    Let us know any other tricks you find out Bobby!

  • Bobby Davison says:

    ” If anyone knows a way to turn off creation of icons for created datasets, please let me know.”

    Well there is a partial solution…

    Go to the Project Properties and select “Output Data Sets”. Specifying a value of zero will suppress the display of all dataset icons. However, it will display a note attached to each task with the title “Data Set Limit Reached” and detail in the note saying something like “There were 50 data sets created, however only the first 0 data sets were added to the project.”

    HTH

  • Quentin McMullen says:

    Thanks for the comments David!

    I think we’re roughly on the same page.

    Maybe I should have described my use of non-connected process flows (non-flowy flows?) as “unusual” or “alternative” rather than “non-traditional”?

    And agree with you when you say that for SAS novices and business users, the “window dressing” is the point. And also agree that process flows were not necessarily designed to illustrate traditional SAS programming techniques (i.e. controlling code modules via %includes or macro language rather than as connected tasks).

    What you see as me trying to squeeze my approach to code development into a new paradigm of EG, I see as me taking a new tool (EG) and trying to figure out how it can improve my productivity. Since I’ve got years (not as many as you : ) of SAS programming experience, my interest is mostly in figuring out how EG will help leverage that experience. For me, that means figuring out how EG can help me program better.

    But there are many folks, like the author of the paper I cited in my introductory EG programming post, who feel differently, and suggest SAS programmers should consider switching completely from writing SAS code themselves, to generating code via EG.

    To me, that’s the beauty of a tool, there are many different ways to use it.

  • David Birch says:

    Quentin,
    It’s not so much that you’re using process flows in a non-traditional way (EG has only a short history of widespread use, so there are no real traditions), as that you appear to be trying to squeeze your personal ‘tradition’ of code development into the new paradigm of EG.

    I’ve been using SAS for 30 years now, and EG for 7 years. When DMS, FSEDIT and later on ViewTables were first introduced, I encountered a lot of established SAS programmers who couldn’t adapt easily to the new levels of interactivity – and even to not defaulting everything to upper case!

    The Process Flow is meant to be a visual representation of the data from inputs through to outputs effected by processing tasks, and frequently involving many intermediate results and tasks. I.e. the “window dressing” is the point. SAS novices and business users really like it.

    Modular programming isn’t what process flows are designed to illustrate. It is supported within code nodes and enhanced by the introduction of stored processes – particularly when parameterised and using prompts. Personally, I often have a “code utilities” Process Flow, which (like the above example labelled “Good”) isn’t actually a ‘flow’.

    I also usually have the generation of icons for the task log turned off. I don’t consider a log to be an ‘output’ or a ‘result’, but to be an attribute of a submitted task. Right-clicking on the task to access the log may take some getting used to for programmers used to the log always being open and visible, but it usually doesn’t take long.

    Another tip to “beautify” your projects, is to break your overall process into multiple flows. Eventually, you may realise that projects that have been “beautified” are also better organised and simpler to understand. It’s an element of ‘refactoring’ that I would recommend.

  • Quentin says:

    Thanks again Chris, especially for the links. I only started reading your blog last year. I guess it’s time for me to start reviewing your archives!

  • Adding the links helps with readability (so you can see what flows into what) and helps to enforce the sequence when you run it in EG. This post has more tips.

    I don’t think it takes away from your “modular” program approach — so I don’t see a downside from that perspective, unless you mean to “reuse” a code item in several flows and you find that adding links gets you all tangled up.

    Another construct you can use that helps with reuse: the Ordered List.

  • I must be the only person who never really looks at the process flow. I get frustrated because the Projects area gets so unruly. :-)

  • Fareeza says:

    That is a great suggestion!

  • Quentin McMullen says:

    Thank you Michelle!!! Love that tip, and I’m sure it will help save me from myself.

  • Quentin McMullen says:

    Oh yeah, I’ve ended up with many more spaghetti lines all over my process flow. Sometimes it starts ugly, and then you get so many lines it can become pretty again (in an ink-blot sort of way : )

  • Quentin McMullen says:

    Thanks Chris,
    So when you decide to manually link code sections, is it because you are deciding that you will always run that code in EG? Part of my inclination to avoid using the links was so that I could develop (modular) code in EG, but then run it as a stored process, or just as an old-school batch job submitted in a PuTTY terminal window to run on our Linux server (more on that to come in my next post….)

  • And when working with multiple process flows it may also be handy to change the background colour, particularly if you want to visually distinguish a particular process flow. For example, if all process flows had a white background you may accidentally make changes to your PROD process flow instead of DEV so a glaring black background colour could save you… :-)

    To change the background colour, simply right mouse click on the process flow, select Background Color and then choose the colour you want.

  • Fareeza says:

    That’s one reason I liked EM, it didn’t show the datasets.

    You’ve demonstrated the UGLY with a few datasets but realistically, some of my procedures end up generating 50 tables from one procedure. That is really ugly.

  • Quentin,

    Like you, I have several “code-only” projects that I use. However, I like to add explicit Links (right-click, Link To — or “draw” a link) to show (and enforce) the proper sequence on the process flow.

    Also, the extra “gunk” that EG adds when you run a flow is sort of EG’s way of showing off what it knows about your results. You can also use the Project Reviewer task to create some useful reports about how your project ran.