

SAS VA Data Builder: Add a New Data Item

2016-02-05 – 9:25 AM

The SAS Visual Analytics Data Builder allows you to prepare and load data to the SAS LASR Server so you can complete analysis or build reports and dashboards. It’s a little trickier to use than some of the other parts of SAS Visual Analytics, but it can usually be mastered within a day. If you need to add a data item to your data table, …


3 Methods to Remove Data from the HDFS VAPUBLIC Directory

2015-12-20 – 3:53 PM

When working with HDFS, you often load data to the various directories so it can be uploaded into SAS Visual Analytics. However, sometimes you want to remove that data. Here are three different ways to remove it.

Method 1: Use the SAS VA Administrator Tools

If you have access to the Manage Environment area, you can use the Explore HDFS tab to interact with the data in the HDFS. Select the data set name and then the Trashcan icon to delete the data.


If you are using Cloudera or HortonWorks, you can use their tools to manage the HDFS. Here’s how you would do it from the HUE File Browser from Cloudera. Many SAS VA users find it’s easier to work with Hadoop using one of these commercial vendors.


Method 2: Use the Command Line

The HDFS shell commands are modeled on the Linux file system commands, so many of the commands that you use with Linux also work for controlling the HDFS. For instance, if you want to list the files in a directory, you would use the ls command. It works the same for the HDFS. In this example, I do the following things:

  1. List the contents of the vapublic subdirectory.
  2. Copy hps/c_orders_main to the vapublic directory.
  3. List the contents of the vapublic subdirectory.
  4. Delete the vapublic/c_orders_main data set.


What’s different about these commands is that you type “hdfs dfs -[command] <path>” instead of just the command, and full path names are used. So here are the steps above repeated as HDFS commands.

  1. hdfs dfs -ls /vapublic
  2. hdfs dfs -cp /hps/c_orders_main.sashdat    /vapublic
  3. hdfs dfs -ls /vapublic
  4. hdfs dfs -rm /vapublic/c_orders_main.sashdat
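
The four steps can also be collected into one small script. This is only a sketch: the helper names (run, target_path, remove_demo) and the DRY_RUN switch are made up for illustration, and the paths are the ones used in this post. Because it ends with a delete, it defaults to printing the commands rather than running them, and the delete targets the copy in /vapublic, as in step 4 of the first list.

```shell
#!/bin/sh
# Sketch: list, copy, list again, then delete the copy in /vapublic.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to run them.
# SRC and DEST_DIR are the example paths from this post; adjust for your cluster.

DRY_RUN="${DRY_RUN:-1}"
SRC="/hps/c_orders_main.sashdat"
DEST_DIR="/vapublic"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build the path the copied file will have inside the destination directory.
target_path() {
    echo "$2/$(basename "$1")"
}

remove_demo() {
    run hdfs dfs -ls "$DEST_DIR"                          # 1. list the directory
    run hdfs dfs -cp "$SRC" "$DEST_DIR"                   # 2. copy the file in
    run hdfs dfs -ls "$DEST_DIR"                          # 3. list again to confirm
    run hdfs dfs -rm "$(target_path "$SRC" "$DEST_DIR")"  # 4. delete the copy
}

remove_demo
```

Run it once as-is to review the printed commands, then run it again with DRY_RUN=0 on a machine that has the hdfs client when you are happy with them.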

You can find a reference for the commands at the Apache Software Foundation Hadoop site.

Method 3: Use SAS Studio

If you have already been using code to upload the data, then you can use code to delete the data. To make it even easier, you delete it using PROC DATASETS, the same as when working with any SAS library. Refer to the SAS documentation for more details about the DATASETS procedure.

In this example, I assign the VAPUBLIC directory to a library using the LIBNAME statement, much as you would for any other SAS library. The SASHDAT engine allows SAS to distribute the data set into blocks across the data nodes.

/* Assign the library to the HDFS directory */
libname myHDFS SASHDAT
        host="yourserver.com"
        install="/opt/sas/TKGrid"
        path="/vapublic";

/* Delete the data set from the library */
proc datasets lib=myhdfs nodetails nolist;
     delete c_orders_main;
run;
quit;

Here’s the log from a run of this code (the host, install path, and data set name in the log differ from the listing above). You can see the library was assigned and then the data set was deleted. Unlike with the command line method, you only list the data set name and not the extension.

59 libname myHDFS SASHDAT
60 host="myserver.com"
61 install="/opt/sas.com"
62 path="/vapublic";
NOTE: Libref MYHDFS was successfully assigned as follows:
Physical Name: Directory '/vapublic' of HDFS cluster on host 'myserver.com'

70 ! proc datasets lib=myhdfs nodetails nolist;
71 delete candy_adf;
72 run;

NOTE: Deleting MYHDFS.CANDY_ADF (memtype=DATA).
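
If you want to confirm from the command line that the PROC DATASETS step really removed the file, you can test for it with hdfs dfs -test -e. The sketch below assumes the data set is stored as <name>.sashdat in the library path, as c_orders_main.sashdat is in the command line example. The function names sashdat_path and report_status are made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether a data set deleted in SAS is gone from HDFS.
# Assumption: the file is named <data set name>.sashdat inside the library path.

# Build the HDFS file path from a directory and a data set name.
sashdat_path() {
    echo "$1/$2.sashdat"
}

# Report whether the file is still present (needs the hdfs client).
# Note: any non-zero exit from the test, including a client error,
# is reported as "is gone".
report_status() {
    file="$(sashdat_path "$1" "$2")"
    if hdfs dfs -test -e "$file"; then
        echo "$file still exists"
    else
        echo "$file is gone"
    fi
}

# Print the path that would be checked; this line needs no cluster.
sashdat_path /vapublic c_orders_main
```

On a machine with the hdfs client, call report_status /vapublic c_orders_main after the SAS step has run.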

If you do not know the host or TKGrid installation path, you can find out by asking the SAS administrator. Another method is to check the code written by the Data Builder. You will have to create a temporary query that saves to the HDFS library. Here’s an example.


These examples were created with the distributed SAS Visual Analytics 7.3.


SAS VA: Dealing with Missing Dates

2015-12-20 – 1:27 PM

It’s nearing the new year and you may have already started preparing reports that have the organization’s goals and the progress toward those goals. However, your chart may not be displaying correctly. If your data is not set up properly, then you’ll notice that you only have one or two months when you want to see the entire year even if there is no data yet. …

Adding a Stored Process to SAS Visual Analytics

2015-12-01 – 7:04 AM

Not all data belongs in SAS Visual Analytics – it’s true. You may have situations where you want to filter and zoom on data and then look at the data in another system. Maybe it’s a list of items that the user wants to follow up on in a different way. If you are using a non-distributed version of SAS Visual Analytics you may be particular …

Oh Snap! Upload Data to the LASR Server Just Like That!

2015-10-24 – 7:55 AM

In a past life I worked at a company that had an excellent general manager. She was professional, intelligent, and a role model. It was a small company competing with giants and successfully kicking their butts. She set a new mission for the organization – we needed to not only meet a customer’s product needs but also delight them in the process. It’s hard to …

3 Quick Steps for Using External Links to Enrich Your SAS VA Report

2015-10-04 – 3:17 PM

If you read my Thanks for the Negative Tweets Josh post on LinkedIn then you know that I’ve been working on some Twitter reports in SAS Visual Analytics. My goal is to have a set of reports that I can use for demos, training, blog posts, webinars, and even eBooks. In my current draft, the biggest challenge is determining what Twitter data tells an interesting story and provides …

Visual Analytics: 3 Tips for Brilliant Bubbles in Your Geospatial Data

2015-09-13 – 11:30 AM

I like bubble charts – even though they get a lot of flak for being difficult to understand. These charts can pack a lot of data into a few variables. I agree the charts can be difficult for a layperson to understand, but that doesn’t seem true for the geo bubble maps. Possibly it’s because the user sees the map and understands it’s related to location. Here’s …