Monthly Archives: December 2020

Organizing Your Feature Files

“How to organize feature files?” was a question asked by Gojko Adzic in a recent blog post. He presented several options and then asked for responses. The options included grouping the files by user story, by capability, and by level of detail. This question has come up often over the past years in the workshops that I teach.

For a small set of files, you can keep them in a single directory and use descriptive names for the files. For a larger set, a directory hierarchy is a common organization.  A network of files (using keywords to represent grouping) could also be created. Let’s look in detail at the hierarchy and network.   

Work Hierarchy

The overall hierarchy could represent either the work items or the functionality. With work items, the files are placed in a structure that parallels the order in which they were implemented. The iterations or development cycles are used for the folder names. This makes it easy to find the files associated with a story / work item.

Iteration 7
   Story 171
   Story 895

Functional Hierarchy

With functionality, the hierarchy represents the user experience or the operational workflow, that is, the behavior at a higher level. The folder names represent steps or sub-steps in the flow. There could be separate folders for operations that are common to multiple steps.

Place an order
   Compute total order amount
      Compute tax
          No tax state 
          Tax exempt organizations

At the lowest steps in the structure, there could be either a single feature file or multiple files in a folder that represents a behavior.   This would be a small behavior that might have been created by a single story or by multiple stories.    

An alternative is a network form, in which each file carries metadata that can be used to group the files. For instance, tags could represent the groups. This would be useful if there was no operational hierarchy. For example, feature files might contain:

@Order @Tax @Exempt
Feature: Determine organizations that are exempt from taxes

@Order @Tax @NoTaxState
Feature:  Determine states for which no tax should be applied

The files would be displayed in groups based on the tags – either single or multiple.    You could create multiple views of the same sets of files.  Those views could also represent a functional hierarchy. 
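As a rough illustration of tag-based views, here is a minimal Python sketch that groups feature files by the tags that precede the Feature: keyword. The file names and the helper functions (extract_tags, group_by_tag, files_with_all) are hypothetical and exist only for this example; a real tool would read the files from disk and use a proper Gherkin parser.

```python
import re
from collections import defaultdict

TAG_PATTERN = re.compile(r"@(\w+)")


def extract_tags(feature_text):
    """Return the tags found on the tag lines that precede 'Feature:'."""
    tags = []
    for line in feature_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Feature:"):
            break
        if stripped.startswith("@"):
            tags.extend(TAG_PATTERN.findall(stripped))
    return tags


def group_by_tag(features):
    """Map each tag to the sorted names of the feature files carrying it."""
    groups = defaultdict(list)
    for name, text in features.items():
        for tag in extract_tags(text):
            groups[tag].append(name)
    return {tag: sorted(names) for tag, names in groups.items()}


def files_with_all(features, *wanted):
    """Names of the feature files that carry every one of the wanted tags."""
    return sorted(
        name for name, text in features.items()
        if set(wanted) <= set(extract_tags(text))
    )


# Hypothetical file names; the contents are the two examples from the post.
FEATURES = {
    "exempt_organizations.feature":
        "@Order @Tax @Exempt\n"
        "Feature: Determine organizations that are exempt from taxes\n",
    "no_tax_states.feature":
        "@Order @Tax @NoTaxState\n"
        "Feature:  Determine states for which no tax should be applied\n",
}

if __name__ == "__main__":
    for tag, names in sorted(group_by_tag(FEATURES).items()):
        print(tag, names)
```

Because the grouping is computed from the tags, the same set of files can be shown in several views (for example, everything under @Tax, or only the files tagged with both @Order and @NoTaxState) without moving any file.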


It doesn’t take too much effort to have files in a functional hierarchy. The advantage is that scenarios are now in the context of the flow.  Scenarios related to the same flow step are together.   A story that changes the behavior of an existing step may not create a new feature file, but just alter an existing one.    The living documentation represents how a system is used, not how it was created.  So it’s in the domain of the customer, not the implementer.

If a connection back to the stories is desired, you could use tags that identify the stories (e.g. @Story1234). Each feature file would carry one or more tags naming the stories that were the reason for its creation or change.

Decompose Scenarios for Simpler Scenarios

A blog question on relative dates by Gojko Adzic triggered a blog post by Seb Rose.   The two blog posts showed there are many shades of gherkin.  I’d like to use the example in those two posts to demonstrate a couple of facets of scenario decomposition. This uses a slightly different shade than Seb’s.


Let’s start with a scenario that might have been created during the discovery phase for this story. The example revolves around credit card transactions that reserve an amount of money.   If those transactions are not finalized by an actual charge within a certain amount of time, they are cancelled.   The scenarios might look like:

Scenario: Transaction aged by one month or less remains pending
Given a transaction received one month ago
When batch is processed
Then transaction remains pending

Scenario: Transaction aged over one month is cancelled
Given a transaction received over one month ago
When batch is processed
Then transaction is cancelled

These scenarios assume that the customer knows about the daily batch job; otherwise, the When might be reworded using a term the customer understands.


When the scenarios from discovery are explored in the formulation phase, additional scenarios may be created that capture more detail of the behavior. I differentiate between flow scenarios (like the ones above) that have state (e.g. a transaction that must be created) and calculation scenarios where an algorithm or business rule yields a result. Calculation scenarios tend to have more (or much more) detail.

There could be some separation of behavior in the cancellation rule.  It could be split into a calculation of the differences in two dates and a determination of the actions based on the differences.   This separation allows for more re-use in different scenarios, just like separation in code.   If the date difference calculation was previously used in an application, then there should already have been a scenario/test for it.   If not, you collaborate on creating a scenario for it:

Scenario:  Calculation - Difference in Two Dates
* Difference in Months and Days between Date and Another Date
| Date        | Another Date | Difference    | Notes                                |
| 20-Feb-2020 | 20-Mar-2020  | 1 month 0 day | common                               |
| 19-Feb-2020 | 20-Mar-2020  | 1 month 1 day | "                                    |
| 29-Feb-2020 | 31-Mar-2020  | 1 month 0 day | previous month does not have the day |
| 29-Feb-2020 | 30-Mar-2020  | 1 month 0 day | "                                    |
| 29-Feb-2020 | 29-Mar-2020  | 1 month 0 day | "                                    |

I typically have a Notes column on every table.  It can explain why a particular set of values is being used.  That’s handy when you re-visit a scenario after enough time has passed that the reason has faded. 

I also suggest the description of the calculation be put in comments with the scenario.   Part of it might read:

#If the previous month does not contain the day, then use the highest day available

This calculation can get rather complicated and messy. Take a look at Seb’s blog for even more examples.   Notice that the calculation is asymmetric.    One month after 29-Feb-2020 is 29-Mar-2020.   Once the triad starts creating this scenario, the customer might alter the requirement to something simpler, like cancel “over 30 days”, unless there was a legal requirement for the complexity.   In that case, the calculation scenario should be reviewed by the subject matter expert to ensure that it meets that requirement.     
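To make the rule in the comment concrete, here is a minimal Python sketch of one way the difference could be computed. It is an assumption about the algorithm, not the calculation used in either blog post: it steps back whole months from the later date, falls back to the highest day available when the target month does not contain the day, and counts the remaining days. The function names months_before and difference_in_months_and_days are made up for this example.

```python
import calendar
from datetime import date


def months_before(day, months):
    """Return the date `months` months before `day`.

    If the target month does not contain the day, use the highest
    day available in that month.
    """
    index = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def difference_in_months_and_days(earlier, later):
    """Return (months, days) between two dates, measured back from `later`."""
    if earlier > later:
        raise ValueError("earlier must not be after later")
    # Start with the calendar-month difference; the anchor then falls
    # in the same month as `earlier`.
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    anchor = months_before(later, months)
    if anchor < earlier:
        # Stepped back too far: use one month less.
        months -= 1
        anchor = months_before(later, months)
    return months, (anchor - earlier).days


def describe(months, days):
    """Format a difference the way the table in the scenario does."""
    return f"{months} month {days} day"


if __name__ == "__main__":
    print(describe(*difference_in_months_and_days(date(2020, 2, 19),
                                                  date(2020, 3, 20))))
```

Under these assumptions the sketch agrees with the five rows of the table. Measuring back from the later date is what makes 29-Feb-2020 to 31-Mar-2020 come out as exactly one month; measuring forward from the earlier date would give a different answer, which is the asymmetry mentioned above.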

With the difference in dates separate, the cancellation rule can be stated as:

Rule:  Cancel transactions aged over one month
| Difference    | Category       | Action         |        
| 1 month 0 day | one_month      | Remain pending |
| 1 month 1 day | over_one_month | Cancel         |

Examples with data for a scenario could be:

| Trans Date  | Batch Date  | Category       | Action         |
| 20-Feb-2020 | 20-Mar-2020 | one_month      | Remain pending |
| 19-Feb-2020 | 20-Mar-2020 | over_one_month | Cancel         |

Note that this business rule uses the result of the difference calculation. Each rule/calculation can be simpler, as each deals with fewer of the details. The examples of this rule utilize examples from the difference calculation rule. Note that the names of the categories should reflect the customer’s terminology.
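The cancellation rule itself is small once the date difference is handled elsewhere. A minimal Python sketch, assuming the difference arrives as a (months, days) pair and that anything up to and including exactly one month falls in the one_month category (as the scenario title "one month or less" suggests); the names categorize, ACTIONS and action_for are hypothetical:

```python
# Actions keyed by category, taken from the rule table.
ACTIONS = {
    "one_month": "Remain pending",
    "over_one_month": "Cancel",
}


def categorize(months, days):
    """Category for a difference given as whole months plus leftover days."""
    # Tuple comparison: (1, 1) and (2, 0) are over one month; (1, 0) is not.
    return "over_one_month" if (months, days) > (1, 0) else "one_month"


def action_for(months, days):
    """Action the batch should take for a transaction of the given age."""
    return ACTIONS[categorize(months, days)]


if __name__ == "__main__":
    print(action_for(1, 0))
    print(action_for(1, 1))
```

Keeping the rule as a function of the difference, rather than of the two dates, mirrors the split between the rule table and the calculation table.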

Let’s incorporate the example data of the cancellation rule into the flow scenario from discovery.

Scenario: Transaction aged by one month or less remains pending
Given a transaction received 20-Feb-2020
When batch processed on 20-Mar-2020
Then transaction remains pending

Scenario: Transaction aged over one month is cancelled
Given a transaction received 19-Feb-2020
When batch processed on 20-Mar-2020
Then transaction is cancelled


Note that when actual data is incorporated, the meaning behind the choice of that data is hidden. To keep both the meaning and the data together, I created a Gherkin preprocessor as an experiment.  It can be found at

With the preprocessor, you create names for values.   You use these names in the scenario.   When the feature file is processed, the names are replaced by the values.   For example:

#define BatchDate 20-Mar-2020
#define OneMonthAgo 20-Feb-2020
#define OverOneMonthAgo 19-Feb-2020

Scenario: Transaction aged by one month or less remains pending
Given a transaction received OneMonthAgo
When batch processed on BatchDate
Then transaction remains pending

Scenario: Transaction aged over one month is cancelled
Given a transaction received OverOneMonthAgo
When batch processed on BatchDate
Then transaction is cancelled

Notice how the scenarios appear close to the ones from discovery.  They are more abstract, but the test that is run contains specific data.  You could use whatever style you like for the #defines.  For example:  

#define batch_date 20-Mar-2020

would make the step into:

When batch processed on batch_date
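The substitution idea can be sketched in a few lines of Python. This is only an illustration of name-for-value replacement, not the preprocessor mentioned above; the function name preprocess, the whole-word matching, and the choice to drop the #define lines from the output are all assumptions made for this example.

```python
import re

# A definition line: "#define <Name> <value...>"
DEFINE = re.compile(r"^\s*#define\s+(\w+)\s+(.+?)\s*$")
WORD = re.compile(r"\b\w+\b")


def preprocess(text):
    """Replace defined names with their values in the lines that follow."""
    definitions = {}
    output = []
    for line in text.splitlines():
        match = DEFINE.match(line)
        if match:
            # Record the definition and leave the line out of the result.
            definitions[match.group(1)] = match.group(2)
            continue
        # Replace only whole words, so OneMonthAgo does not match
        # inside OverOneMonthAgo.
        output.append(
            WORD.sub(lambda m: definitions.get(m.group(0), m.group(0)), line)
        )
    return "\n".join(output)


if __name__ == "__main__":
    source = (
        "#define BatchDate 20-Mar-2020\n"
        "#define OverOneMonthAgo 19-Feb-2020\n"
        "Given a transaction received OverOneMonthAgo\n"
        "When batch processed on BatchDate\n"
        "Then transaction is cancelled\n"
    )
    print(preprocess(source))
```

A name only takes effect on lines after its definition, which keeps the sketch to a single pass over the file.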

The #defines might be changed if a different approach was needed for automation.  For example, if the date of the batch processing could not easily be set to a test date, then you might use:     

#define BatchDate Today()
#define OneMonthAgo TodayLess(1 month 0 day)
#define OverOneMonthAgo TodayLess(1 month 1 day)

Now the dates used in the automated test will be relative to today’s date.   But the scenario itself has not changed.   You could convert these symbols into the relevant values in the step definition, rather than use the preprocessor.  That would make the actual data used less transparent, which is a discussion for another article.  
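For the step-definition alternative, here is a minimal Python sketch of how symbols such as Today() and TodayLess(1 month 0 day) might be resolved to dates. The semantics are assumed from the examples in this post (step back whole months first, using the highest day available if the month is too short, then subtract the days); the function names resolve and months_before are hypothetical, and `today` is passed in so the behavior can be checked against a fixed date.

```python
import calendar
import re
from datetime import date, timedelta

TODAY_LESS = re.compile(r"^TodayLess\((\d+) month (\d+) day\)$")


def months_before(day, months):
    """Date `months` months before `day`, using the highest day available
    when the target month does not contain the day."""
    index = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def resolve(expression, today=None):
    """Turn 'Today()' or 'TodayLess(M month D day)' into a date."""
    if today is None:
        today = date.today()
    if expression == "Today()":
        return today
    match = TODAY_LESS.match(expression)
    if match is None:
        raise ValueError(f"unrecognized date expression: {expression!r}")
    months, days = int(match.group(1)), int(match.group(2))
    return months_before(today, months) - timedelta(days=days)


if __name__ == "__main__":
    print(resolve("TodayLess(1 month 1 day)", today=date(2020, 3, 20)))
```

With today fixed at 20-Mar-2020, the two TodayLess expressions resolve to the same dates as the literal #defines used earlier, so the scenarios exercise the same boundary either way.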


Splitting a story into flow scenarios and calculation scenarios can simplify each of the scenarios.  The calculation scenarios may be re-usable in other scenarios.  The advantages of smaller scenarios can parallel the advantages of smaller methods in coding.   We’ll explore that in a subsequent post.