Exercise 2.3 Feature Caching
Data 3-1-1 case location details (XLS hosted on FTP)
Overall Goal Inspect data
Demonstrates Inspect data using feature caches
Start Workspace C:\FMEData2018\Workspaces\IntroToDesktop\Ex2.3-Begin.fmw
End Workspace None

Let's use feature caching to inspect our data.

1) Start Workbench

Start Workbench (if necessary) and open the workspace from Exercise 2.1. Alternatively you can open C:\FMEData2018\Workspaces\IntroToDesktop\Ex2.3-Begin.fmw.

2) Use Feature Caching

Feature caching is turned off by default; you can see the magnifying glass and small play button icon in the top left of your toolbar:

Click the button to turn feature caching on, or go to Run > Run with Feature Caching. Then rerun your workspace. You should notice a little green square with a magnifying glass filling up on your reader feature type:

This icon represents a feature cache. The cache stores features at that point of the translation, and you can click it to inspect the data. Click the green icon to inspect the data at that point of the translation; your original Excel data will open in Data Inspector.

3) Use Partial Runs with Feature Caching

Feature caching also allows us to selectively run sections of our workspace.

If you select your reader feature type by clicking it, you will see some tool-tip icons appear above it. Here you have two partial runs options. Because this feature type is the first object in the workspace, you can choose to Run Just This, which in this case reads the data and caches it. You can also choose to Run From This, which will run from this object to the end of the workspace, running only what it is connected to.

If you mouse over each option, you will see the objects that will run if you click are highlighted. For Run From This, just the reader feature type is highlighted:

For Run From This, the writer feature type will also run:

Finally, you can select the writer feature type and see the option Run To This, which will run all connected objects preceding this one:

In this specific scenario these tools are not that useful, since we only have two objects in our workspace. However, as our workspace grows, partial runs help one test and develop small sections of your workspace at a time. For now, use Run Just This on your reader feature type. Then, click the green icon to inspect your read data.

4) Sort Table View to Check for Missing and Inconsistent Values

Now that Data Inspector is open, let's use its Table View to identify problems in our source data.

The column headers in Table View contain the names of each attribute. If you right-click them you get a context menu with many options: to sort the column alphabetically (ascending or descending) or numerically (ascending or descending), or to clear all sorting:

Since ultimately we want to report on 3-1-1 calls by local planning area, let's see if any of our features are missing values for the Local_Area attribute. To do so, scroll over in your Table View until you see the Local_Area attribute. Then, right-click and select Sort Alphabetically Ascending:

If you scroll down, you will find there are 4,122 features with missing values for Local Area!

If you take the time to look through, it turns out we have two kinds of missing data:

  1. Some records do not have street addresses, leading them to not be associated with a local area.
  2. There are also two rows of missing data in between each month of records. You can verify this by scrolling down to find the gaps between each month (where the Month attribute changes from 1 to 2, 2 to 3, etc.).

If you have a keen eye for data problems, you might also spot inconsistent naming for the values in the Local Area attribute. Two of the area names have some values with a hyphen in some of the values, but not all: Dunbar Southlands and Dunbar-Southlands and Arbutus Ridge and Arbutus-Ridge:

You can verify this problem by using the search bar in the bottom-left of Table View. Click the drop-down and select Local_Area:

Compare the results (visible in the bottom right) of typing in "Arbutus" and "Arbutus-" in the text box with the magnifying glass. You'll find that of the 2,236 features in the Arbutus-Ridge local area, 116 have a hyphen between the words:

We will address both these data quality issues in a later exercise.


TIP
This kind of visual inspection of data is a good first step, but more thorough data validation is usually required. As this is an introduction, we won't go into details regarding validation. If you want to learn more about data validation with FME, check out this article.

CONGRATULATIONS
By completing this exercise, you have learned how to:
  • Use FME Workbench feature caching to open a dataset in the FME Data Inspector
  • Sort data in the FME Data Inspector Table View

results matching ""

    No results matching ""