SilverAge Software

Search and Replace. Edit. Transform.

Use Cases

How do you create a CSV file with e-mails collected from multiple files?

Question

OK, using Text Workbench how do you collect e-mails from multiple files?

Answer

It depends on your source data. We will consider two examples: a simple one that collects only the e-mail addresses, and a slightly more complex one that collects both the e-mail addresses and the names of the people they belong to.

Collecting The E-mails Only

Since we need only the e-mail addresses, there is no need to match the A tags. The expression looks like this:

[\w\.\_\d]+\@[\w\.\_\d]+\.\w+

where [\w\.\_\d]+ matches one or more letters, dots, underscores, or digits.
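
To see what this expression picks up outside Text Workbench, here is a minimal Python sketch. It assumes the character class can be shortened to [\w.], since \w in Python already covers digits and underscores. The sample string and the names EMAIL_RE and sample are made up for illustration.

```python
import re

# Python spelling of the expression above (an assumed equivalent):
# \w already includes digits and underscores, so only the dot is added.
EMAIL_RE = re.compile(r"[\w.]+@[\w.]+\.\w+")

# Hypothetical sample text, not taken from any real file.
sample = (
    '<p>Write to <a href="mailto:john.doe@example.com">John</a> '
    "or sales_1@example.org.</p>"
)

found = EMAIL_RE.findall(sample)
print(found)  # ['john.doe@example.com', 'sales_1@example.org']
```

Note that the class does not include hyphens or plus signs, so an address such as info@my-site.example would not be matched as a whole by this expression.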

  1. Specify the folder in which you want to find files. For example: C:\MyWebFiles
  2. Set the mask(s) of the files you want to find. For example: *.htm*
  3. Switch on the regular expressions.
  4. Enter the search expression in the Find What field.
  5. Go to the Collect tab and tick the Collect... option.
  6. Specify the path to the file to which the found e-mails will be added, for example: C:\MyWebFiles\emails.csv.
  7. Set the Collected text option to Found text, and Text entry separator to New line.
  8. Click the Search button.
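
For comparison, the steps above can be approximated in a short Python script. This is only a sketch of the same idea, not how Text Workbench works internally: the function name collect_emails is hypothetical, the search is assumed to include subfolders, and the files are assumed to be readable as UTF-8.

```python
import re
from pathlib import Path

# Assumed Python equivalent of the search expression.
EMAIL_RE = re.compile(r"[\w.]+@[\w.]+\.\w+")


def collect_emails(folder, mask, out_path):
    """Append every e-mail found in matching files to out_path, one per line."""
    out_path = Path(out_path)
    count = 0
    with out_path.open("a", encoding="utf-8") as out:
        # rglob also searches subfolders (an assumption about the desired scope).
        for path in sorted(Path(folder).rglob(mask)):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for email in EMAIL_RE.findall(text):
                out.write(email + "\n")  # "New line" as the entry separator
                count += 1
    return count
```

With the values from the steps, the call would be collect_emails(r"C:\MyWebFiles", "*.htm*", r"C:\MyWebFiles\emails.csv"). The output file is opened in append mode, so running the script twice adds the same addresses twice.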

Collecting The E-mails and The Anchor Text (usually the contact person's name)

In this case, we have to match the A tags.

The search expression will be:

\<a[^\>\<]#href\=\"mailto\:([\w\.\_\d\@]+)\"[^\>\<]@\>([^\<]#)\<\/a\>

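here the first parenthesized group, ([\w\.\_\d\@]+), captures the e-mail address from the mailto: link, and the second group, ([^\<]#), captures the text between the opening and closing A tags.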

The easiest way to understand this expression is to copy it and experiment with it in the Regular Expression Laboratory.
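
If you want to try the same pattern outside the program, the following Python sketch may help. It rests on an assumption that should be checked against the Text Workbench regular expression reference: that # behaves like a non-greedy "one or more" (+? in Python) and @ like a non-greedy "zero or more" (*? in Python). The sample HTML and the name ANCHOR_RE are made up.

```python
import re

# Assumed translation of the Text Workbench expression into Python syntax:
#   #  ->  +?   (one or more, non-greedy)   -- assumption
#   @  ->  *?   (zero or more, non-greedy)  -- assumption
ANCHOR_RE = re.compile(
    r'<a[^><]+?href="mailto:([\w.@]+)"[^><]*?>([^<]+?)</a>'
)

# Hypothetical sample line.
html = '<li><a class="c" href="mailto:jane@example.com">Jane Roe</a></li>'

m = ANCHOR_RE.search(html)
print(m.group(1))  # jane@example.com
print(m.group(2))  # Jane Roe
```

The pattern requires at least one character between <a and href, which is normally the separating space, and it expects the address to be enclosed in double quotes.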

The stored subexpressions have to be rearranged to fit the CSV format.

Note that a replacement string can be used for collecting even when you only search and replace nothing. So, in the Replace With field we specify:

\2,\1

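here \2 is the anchor text (the second stored subexpression) and \1 is the e-mail address (the first one), so each collected line has the form name,e-mail.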
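
The effect of this replacement string can be illustrated in Python, where Match.expand accepts the same \1 and \2 backreferences. The pattern is an assumed Python translation of the search expression (# as +?, @ as *?), and the sample anchor is invented.

```python
import re

# Assumed Python translation of the search expression.
ANCHOR_RE = re.compile(
    r'<a[^><]+?href="mailto:([\w.@]+)"[^><]*?>([^<]+?)</a>'
)

# Hypothetical sample anchor.
html = '<a href="mailto:jane@example.com">Jane Roe</a>'

m = ANCHOR_RE.search(html)
# \2 = anchor text, \1 = e-mail address, joined by a comma as in the CSV line.
line = m.expand(r"\2,\1")
print(line)  # Jane Roe,jane@example.com
```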

  1. Specify the folder in which you want to find files. For example: C:\MyWebFiles
  2. Set the mask(s) of the files you want to find. For example: *.htm*
  3. Switch on the regular expressions.
  4. Enter the search expression in the Find What field.
  5. Enter the replacement expression in the Replace With field.
  6. Go to the Collect tab and tick the Collect... option.
  7. Specify the path to the file to which the found e-mails will be added, for example: C:\MyWebFiles\emails.csv
  8. Set the Collected text option to Replacement text, and Text entry separator to New line.
  9. Click the Search button.
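
As a rough cross-check of the steps above, the whole procedure can be sketched in Python. This is not the program's own mechanism. The function name collect_contacts is hypothetical, the pattern is an assumed translation of the search expression (# as +?, @ as *?), subfolders are assumed to be included, and files are assumed to be UTF-8. The sketch also swaps the plain \2,\1 string for Python's csv module, which quotes names that contain commas.

```python
import csv
import re
from pathlib import Path

# Assumed Python translation of the search expression.
ANCHOR_RE = re.compile(
    r'<a[^><]+?href="mailto:([\w.@]+)"[^><]*?>([^<]+?)</a>'
)


def collect_contacts(folder, mask, out_path):
    """Append one name,e-mail row per mailto anchor found in matching files."""
    rows = 0
    with Path(out_path).open("a", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        # rglob also searches subfolders (an assumption about the desired scope).
        for path in sorted(Path(folder).rglob(mask)):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            # findall yields (group 1, group 2) = (e-mail, anchor text).
            for email, name in ANCHOR_RE.findall(text):
                writer.writerow([name.strip(), email])  # same order as \2,\1
                rows += 1
    return rows
```

A name such as "Roe, Jane" would break a line produced by the plain \2,\1 pattern into three fields, whereas csv.writer wraps it in quotes so that it stays a single field.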