Stripping Non-ASCII Characters within Macro
Faced with a pesky en-dash issue (likely related to transcoding between SAS and SQL Server), I discovered today that there is no built-in way to remove non-ASCII (or extended-ASCII) characters within SAS.
There is a great SUGI paper on this topic (here), but its approach requires a data step. If you need this as a macro capability, the extract below should save you some fiddling around.
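%macro ascii_list; /* wrapper reconstructed; the macro name is an assumption */
  %local i asciichars;
  /* adjust here to include any additional chars */
  %do i=32 %to 126;
    %let asciichars=&asciichars%qsysfunc(byte(&i));
  %end;
  &asciichars
%mend ascii_list;
/* store in macvar for efficiency */
%let ascii_chars=%ascii_list;
* Example usage within macro language;
%put %sysfunc(compress(my – endash,&ascii_chars,k ));
* Example usage within data step (symget is assumed here, to keep the quotes away from the tokenizer);
data _null_;
  str="goodbye •–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶· nasties" ;
  clean=compress(str,symget('ascii_chars'),'k');
  put clean=;
run;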
The main gotchas were as follows:
- The characters returned by byte() for 3, 4, 5, 12 and 13 do funny things in macro (open code recursion, etc.)
- It is not advisable to reference code points above 127, as this extended set can vary from country to country (the byte() number may not be the same as the rank() number)
- The 32-126 range includes the double quote (34) and the apostrophe / single quote (39), and thus they need to be handled appropriately!
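As an illustration of the last gotcha, here is a minimal sketch for cases where a data step is available after all: the keep-list is built at run time, so the quote characters never pass through the macro processor or a string literal. The variable names (keep, str, clean) and the sample string are made up for the example, and this is only one way of handling it.

```sas
data _null_;
  length keep $95;
  /* build the keep-list at run time: code points 32 to 126 */
  do i=32 to 126;
    substr(keep,i-31,1)=byte(i);
  end;
  /* sample value containing both quote types and an en-dash */
  str='it''s a "quoted" – value';
  /* the k modifier keeps only the listed characters */
  clean=compress(str,keep,'k');
  put clean=;
run;
```

Because compress() works byte by byte, a multi-byte en-dash in a UTF-8 session should be dropped in full, since each of its bytes falls outside the 32-126 range.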