SilverAge Software
Search And Replace. For Windows.
Use Cases
How do you create a CSV file with e-mails collected from multiple files?
Question
OK, using Text Workbench, how do you collect e-mails from multiple files?
Answer
It depends on your source data. We will consider two examples: a simple case that collects only the e-mails, and a slightly more complex one that collects both the e-mails and the person names.
Collecting The E-mails Only
As we need the e-mails only, we won't bother matching the A tags. The expression will be like this:
[\w\.\_\d]+\@[\w\.\_\d]+\.\w+
where [\w\.\_\d]+ matches a run of one or more letters, dots, underscores, and digits.
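To see what this expression picks up, here is a minimal Python sketch of an equivalent pattern. It is an illustration only, not Text Workbench itself: the pattern is rewritten for Python's re module (where \w already covers letters, digits, and underscores), and the sample text is made up.

```python
import re

# Python analogue of the expression above (an assumption, not the
# Text Workbench dialect): \w already includes digits and underscores,
# so the character class reduces to [\w.].
EMAIL_RE = re.compile(r"[\w.]+@[\w.]+\.\w+")

# Hypothetical sample text, for illustration only.
sample = 'Contact <a href="mailto:john.doe@example.com">John</a> or mary_1@mail.example.org.'

found = EMAIL_RE.findall(sample)
print(found)  # ['john.doe@example.com', 'mary_1@mail.example.org']
```

In this Python version the character class does not include hyphens, so an address such as info@my-site.com is not matched.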
- Specify the folder in which you want to find files. For example: C:\MyWebFiles
- Set the mask(s) of the files you want to find. For example: *.htm*
- Switch on the regular expressions.
- Enter the search expression in the Find What field.
- Go to the Collect tab and tick the Collect... option.
- Specify the path to the file to which the found e-mails will be added, for example: C:\MyWebFiles\emails.csv
- Set the Collected text option to Found text, and Text entry separator to New line.
- Click the Search button.
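For comparison, the steps above can be sketched as a short Python script. This is a rough analogue under stated assumptions, not a description of how Text Workbench works internally: the function name collect_emails is hypothetical, only the top level of the folder is scanned, files are read as UTF-8, and results are appended to the output file one per line.

```python
import re
from pathlib import Path

# Python analogue of the search expression (assumption: not the
# Text Workbench dialect).
EMAIL_RE = re.compile(r"[\w.]+@[\w.]+\.\w+")


def collect_emails(folder, mask="*.htm*", out_name="emails.csv"):
    """Append every e-mail found in files matching mask to out_name.

    Hypothetical helper: scans only the top level of folder and reads
    each file as UTF-8, ignoring undecodable bytes.
    """
    folder = Path(folder)
    found = []
    for path in sorted(folder.glob(mask)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            found.extend(EMAIL_RE.findall(text))
    # "Found text" with a "New line" separator: one match per line.
    with (folder / out_name).open("a", encoding="utf-8") as out:
        for email in found:
            out.write(email + "\n")
    return found


# Example call (the folder must exist):
# collect_emails(r"C:\MyWebFiles")
```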
Collecting The E-mails and The Anchor Text (usually the contact person name)
In this case, we have to match the A tags.
The search expression will be:
\<a[^\>\<]#href\=\"mailto\:([\w\.\_\d\@]+)\"[^\>\<]@\>([^\<]#)\<\/a\>
here:
- [^\>\<]# matches all extra information between the tag name and the HREF attribute;
- ([\w\.\_\d\@]+) matches and stores the e-mail address (first stored expression);
- ([^\<]#) matches and stores the tag inner text (second stored expression).
We recommend copying this expression and experimenting with it in Regular Expression Laboratory.
We will have to reformat the stored expressions to match the CSV format.
Note that the replacement string can still be used for collecting even when we only search. So, in the Replace With field we specify:
\2,\1
here:
- \2 is the stored inner text of a tag (contact person name);
- \1 is the stored e-mail address.
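The effect of the search expression together with the \2,\1 replacement can be illustrated with a small Python sketch. It assumes that # and @ in the expression above act as one-or-more and zero-or-more quantifiers, which are written here as Python's lazy +? and *?. The sample HTML is made up.

```python
import re

# Python analogue of the search expression (assumption: "#" is treated
# as "one or more" and "@" as "zero or more", both written lazily here).
ANCHOR_RE = re.compile(
    r'<a[^<>]+?href="mailto:([\w.@]+)"[^<>]*?>([^<]+?)</a>'
)

# Hypothetical sample markup, for illustration only.
html = '<p><a class="c" href="mailto:jane.roe@example.com">Jane Roe</a></p>'

# Reformat each match as "inner text,e-mail", like the \2,\1 replacement.
rows = [m.expand(r"\2,\1") for m in ANCHOR_RE.finditer(html)]
print(rows)  # ['Jane Roe,jane.roe@example.com']
```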
- Specify the folder in which you want to find files. For example: C:\MyWebFiles
- Set the mask(s) of the files you want to find. For example: *.htm*
- Switch on the regular expressions.
- Enter the search expression in the Find What field.
- Enter the replacement expression in the Replace With field.
- Go to the Collect tab and tick the Collect... option.
- Specify the path to the file to which the found e-mails will be added, for example: C:\MyWebFiles\emails.csv
- Set the Collected text option to Replacement text, and Text entry separator to New line.
- Click the Search button.
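As a rough Python counterpart to this second procedure, the sketch below collects name and e-mail pairs from matching files and appends them to a CSV file. It is an illustration under assumptions rather than a description of Text Workbench: collect_contacts is a hypothetical name, only the top level of the folder is scanned, and the rows are written with Python's csv module, which, unlike a plain \2,\1 replacement, quotes names that contain commas.

```python
import csv
import re
from pathlib import Path

# Python analogue of the search expression (assumption: "#" and "@" are
# treated as lazy one-or-more and zero-or-more quantifiers).
ANCHOR_RE = re.compile(
    r'<a[^<>]+?href="mailto:([\w.@]+)"[^<>]*?>([^<]+?)</a>'
)


def collect_contacts(folder, mask="*.htm*", out_name="emails.csv"):
    """Append (name, e-mail) rows found in files matching mask to out_name."""
    folder = Path(folder)
    rows = []
    for path in sorted(folder.glob(mask)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            # findall yields (e-mail, inner text); swap to name-first order.
            for email, name in ANCHOR_RE.findall(text):
                rows.append((name, email))
    # newline="" lets the csv module control line endings itself.
    with (folder / out_name).open("a", encoding="utf-8", newline="") as out:
        csv.writer(out).writerows(rows)
    return rows


# Example call (the folder must exist):
# collect_contacts(r"C:\MyWebFiles")
```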