OK, it seems to work, except that the smart-quote patterns I have aren't actually matching what's in the source. In the Cygwin shell the offending characters show up as "â?T", which is sort of meaningless.

Seems like a case of UTF-8 characters...
If I pipe the output to a file and then open it in a UTF-8-capable text editor on my Mac, they come up as the normal smart quotes.
If I save the awk file as UTF-8, it breaks when piped from the batch file.
Is there a proper way to use UTF-8 in bash and awk?
This of course also reminds me that I have to include accented characters as valid. I should have known this was going to get hairier...
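
One workaround I might try, as a rough sketch only: keep the awk script plain ASCII and match the smart quotes by their raw UTF-8 bytes. This assumes the source really is UTF-8 and that the awk in use accepts octal escapes in regexes (gawk and mawk do, as far as I know). `normalize_quotes` is just a made-up helper name, and the sample input is invented.

```shell
# Hypothetical helper: replace UTF-8 curly quotes with plain ASCII quotes.
# LC_ALL=C makes awk treat its input as raw bytes, so the script itself
# never has to be saved as UTF-8; the quotes are spelled as octal escapes.
normalize_quotes() {
    LC_ALL=C awk '{
        gsub(/\342\200\230|\342\200\231/, "\047")  # U+2018, U+2019 -> ASCII single quote
        gsub(/\342\200\234|\342\200\235/, "\042")  # U+201C, U+201D -> ASCII double quote
        print
    }'
}

# Dump the bytes first to confirm what is really in the source ...
printf 'it\342\200\231s\n' | od -An -c

# ... then run the same sample through the helper.
printf 'it\342\200\231s\n' | normalize_quotes
```

If the byte dump shows 342 200 231 where the quote should be, the source is UTF-8 and the "â?T" is just the shell mis-displaying those three bytes. Accented characters would presumably need the same byte-level treatment, which I haven't tried.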
