HandyFile Find And Replace: Text Workbench Online Help

HOWTO: Collect Any Text from Multiple Files In A Single File

Sometimes you need not only to find a portion of text in files, but also to store the found text in a single file. HandyFile Find and Replace can act as a collector: it can collect either the found text or the result of applying a regular expression to the found text block (in other words, the replacement text), even in Search mode.

We shall consider a fairly complex example: searching a directory for HTML files and extracting the heading tags from them, that is, creating a linked table of contents.

The Problem

We expect to receive a file with a table of contents built according to the following rules.

The Solution

  1. Specify the folder that contains the HTML files you want to create the contents for. For example: C:\MyWebFiles\MyBook
  2. Set the mask(s) of the files that you want to find. For example: *.htm*
  3. As we want to search for irregular text blocks, we should use Regular Expressions. Enable them by checking the corresponding option.
  4. Phase 1. Bookmark Headings

  5. Now we have to construct a search expression. The best way to do this is to use the Regular Expression Laboratory, which lets you provide any sample text, enter an expression, and see how it works.

    For brevity, we shall omit the construction procedure. An expression that matches any heading tag, with or without attributes, looks as follows:

    \<(H[1-6])(.@)\>([^\<]#)\<\/H[1-6]\>
      !        !     !
    expr1    expr2  expr3
  6. The Collector uses replacement expressions even when you are only searching for text. If we merely wanted to collect the headings, we could do so without modifying the files. But before we collect the contents entries, we need to insert bookmarks into the headings, so we shall first perform a replacement operation.
  7. The replacement expression could be:

    <\1\2><a name="#\R">\3</a></\1>
  8. Click the Replace button. Now we are through with tagging headings.
  9. Phase 2. Generate the Contents

  10. Go to the Collect tab and check the Collect... option. Set the path to the collector file: C:\MyWebFiles\toc.html. Set the Collected text to Replacement text, and Text entry separator to New line.
  11. The search expression is:

    \<(H[1-6]).@\>\<a name\=\"(.#)\"\>([^\<]#)\<\/a\>\<\/H[1-6]\>
      !                        !         !
    expr1                    expr2     expr3
  12. The replacement expression, which forms each contents entry, is:

    <div class="toc\1"><a href="/pch:"c:/mywebfiles"#\2">\3</a></div>
  13. Click the Search button. After the contents file is generated, you can use it as a starting point for creating a richer web page.
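
To check what the two phases should produce before running them on real files, the same logic can be sketched outside HandyFile. The Python fragment below is only an approximation, not the program's own behavior. It assumes that `@` in the expressions above corresponds roughly to a lazy `*?` and `#` to `+` in Python's `re` syntax, and that `\R` yields a running number. The function names `bookmark_headings` and `collect_toc`, and the `page_url` parameter, are hypothetical. The anchor names are plain numbers here, and the link target is a simple relative URL rather than the path operator used in step 12.

```python
import re

# Approximate Python translation of the Phase 1 search expression.
# Assumption: HandyFile's `@` ~ lazy `*?` and `#` ~ `+`.
HEADING = re.compile(r"<(H[1-6])(.*?)>([^<]+)</H[1-6]>")

# Approximate Python translation of the Phase 2 search expression.
BOOKMARKED = re.compile(
    r'<(H[1-6]).*?><a name="(.+?)">([^<]+)</a></H[1-6]>'
)


def bookmark_headings(html, start=1):
    """Phase 1: wrap each heading's text in a numbered <a name="..."> anchor.

    Returns the modified text and the last number used. The counter is an
    assumed stand-in for the \\R operator of the replacement expression.
    """
    counter = start - 1

    def repl(m):
        nonlocal counter
        counter += 1
        tag, attrs, text = m.group(1), m.group(2), m.group(3)
        return f'<{tag}{attrs}><a name="{counter}">{text}</a></{tag}>'

    return HEADING.sub(repl, html), counter


def collect_toc(html, page_url):
    """Phase 2: build one contents entry per bookmarked heading.

    page_url is a hypothetical link target for the page being scanned.
    """
    return [
        f'<div class="toc{m.group(1)}">'
        f'<a href="{page_url}#{m.group(2)}">{m.group(3)}</a></div>'
        for m in BOOKMARKED.finditer(html)
    ]


if __name__ == "__main__":
    sample = '<H1>Intro</H1>\n<H2 class="x">Setup</H2>'
    tagged, _ = bookmark_headings(sample)
    print(tagged)
    # One entry per line, like the "New line" separator in step 10.
    print("\n".join(collect_toc(tagged, "book.html")))
```

Running the sketch on a few sample headings first makes it easier to spot a heading that the expression misses, for example one that contains nested tags, since `[^<]` stops at the first `<`.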